Setting up serverless NEGs

A network endpoint group (NEG) specifies a group of backend endpoints for a load balancer. A serverless NEG is a backend that points to a Cloud Run, App Engine, or Cloud Functions service.

A serverless NEG can represent:

  • A Cloud Run service or a group of services sharing the same URL pattern.
  • A Cloud Functions function or a group of functions sharing the same URL pattern.
  • An App Engine app (Standard or Flex), a specific service within an app, or even a specific version of an app.

This page shows you how to create an external HTTP(S) load balancer to route requests to serverless backends. Here the term serverless refers to the following serverless compute products: App Engine, Cloud Functions, and Cloud Run (fully managed).

Serverless NEGs allow you to use Google Cloud serverless apps with external HTTP(S) Load Balancing. After you configure a load balancer with the serverless NEG backend, requests to the load balancer are routed to the serverless app backend.

Before you begin

Before you begin:

  1. Read the Serverless NEGs overview.
  2. Deploy an App Engine, Cloud Functions, or Cloud Run (fully managed) service.
  3. Install Google Cloud SDK.
  4. Configure permissions.
  5. Add an SSL certificate resource.

Deploy an App Engine, Cloud Functions, or Cloud Run (fully managed) service

The instructions on this page assume you already have a Cloud Run (fully managed), Cloud Functions, or App Engine service running.

For the example on this page, we have used the Cloud Run (fully managed) Python quickstart to deploy the helloworld service in the us-central1 region. The rest of this page shows you how to set up an external HTTP(S) load balancer that uses a serverless NEG backend to route requests to the helloworld service.

If you haven't already deployed a serverless app, or if you want to try a serverless NEG with a sample app, use one of the following quickstarts. You can create a serverless app in any region, but you must use the same region later on to create the serverless NEG and load balancer.

Cloud Run (fully managed)

To create a simple Hello World application, package it into a container image, and then deploy the container image to Cloud Run (fully managed), see Quickstart: Build and Deploy.

If you already have a sample container uploaded to the Container Registry, see Quickstart: Deploy a Prebuilt Sample Container.
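If you just want a backend to experiment with, the prebuilt sample container can be deployed in a single command. The following is a sketch, assuming you have already selected a project with gcloud; the service name helloworld and the region us-central1 match the examples used throughout this page:

```shell
# Deploy Google's prebuilt "hello" sample container as a Cloud Run
# (fully managed) service named helloworld in us-central1.
gcloud run deploy helloworld \
    --image=gcr.io/cloudrun/hello \
    --platform=managed \
    --region=us-central1 \
    --allow-unauthenticated
```

The command prints the service URL when the deployment finishes; you can use that URL to verify the service before putting a load balancer in front of it.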

Cloud Functions

See Cloud Functions: Python Quickstart.

App Engine

See the App Engine quickstart guide for Python 3.

Install Google Cloud SDK

Install the gcloud command-line tool. See gcloud Overview for conceptual and installation information about the tool.

If you haven't run the gcloud command-line tool previously, first run gcloud init to initialize your gcloud directory.

This is required because you cannot use the Google Cloud Console to create a serverless NEG.
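As a minimal sketch, verifying the installation and initializing the tool looks like this:

```shell
# Confirm the gcloud command-line tool is installed and on your PATH.
gcloud version

# Initialize gcloud: sign in, choose a project, and set defaults.
gcloud init
```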

Configure permissions

To follow this guide, you need to create a serverless NEG and create an external HTTP(S) load balancer in a project. You should be either a project owner or editor, or you should have the following Compute Engine IAM roles:

Task                                             Required role
Create load balancer and networking components   Network Admin
Create and modify NEGs                           Compute Instance Admin
Create and modify SSL certificates               Security Admin

Create an SSL certificate resource

If you want to create an HTTPS load balancer, you must add an SSL certificate resource to the load balancer's front end. Create an SSL certificate resource using either a self-managed SSL certificate or a Google-managed SSL certificate. Using Google-managed certificates is recommended because Google Cloud obtains, manages, and renews these certificates automatically.

This example assumes that you have already created an SSL certificate resource.

Reserving an external IP address

Now that your services are up and running, set up a global static external IP address that your customers use to reach your load balancer.

Console

  1. Go to the External IP addresses page in the Google Cloud Console.
  2. Click Reserve static address to reserve an IPv4 address.
  3. Assign a Name of example-ip.
  4. Set the Network tier to Premium.
  5. Set the IP version to IPv4.
  6. Set the Type to Global.
  7. Click Reserve.

gcloud

gcloud compute addresses create example-ip \
    --ip-version=IPV4 \
    --global

Note the IPv4 address that was reserved:

gcloud compute addresses describe example-ip \
    --format="get(address)" \
    --global

Creating the external HTTP(S) load balancer

In the following diagram, the load balancer uses a serverless NEG backend to direct requests to a serverless Cloud Run (fully managed) service. For this example, we have used the Cloud Run (fully managed) Python quickstart to deploy the helloworld service.

[Diagram: HTTPS Load Balancing for a Cloud Run app]

Because health checks are not supported for backend services with serverless NEG backends, you don't need to create a firewall rule allowing health checks if the load balancer has only serverless NEG backends.

gcloud

  1. Create a serverless NEG for your Cloud Run (fully managed) service. For this example, we assume you have deployed a Cloud Run (fully managed) service called helloworld.
    gcloud beta compute network-endpoint-groups create helloworld-serverless-neg \
        --region=us-central1 \
        --network-endpoint-type=SERVERLESS  \
        --cloud-run-service=helloworld
    
    To create a NEG for an App Engine service or a Cloud Functions function, see the gcloud reference guide for gcloud beta compute network-endpoint-groups create.
  2. Create a backend service and add the serverless NEG as a backend to the backend service:
    gcloud compute backend-services create helloworld-backend-service \
        --global
    
    gcloud beta compute backend-services add-backend helloworld-backend-service \
        --global \
        --network-endpoint-group=helloworld-serverless-neg \
        --network-endpoint-group-region=us-central1
    
  3. Create a URL map to route incoming requests to the helloworld-backend-service backend service:
    gcloud compute url-maps create helloworld-url-map \
        --default-service helloworld-backend-service
    
    This example URL map only targets one backend service representing a single Cloud Run (fully managed) service, so we don’t need to set up host rules or path matchers. If you have more than one backend service, you can use host rules to direct requests to different services based on the host name, and you can set up path matchers to direct requests to different services based on the request path.
  4. If you want to create an HTTPS load balancer, you must have an SSL certificate resource to use in the HTTPS proxy. You can create an SSL certificate resource using either a self-managed SSL certificate or a Google-managed SSL certificate. Using Google-managed certificates is recommended because Google Cloud obtains, manages, and renews these certificates automatically. To create a Google-managed certificate, you must have a domain. If you do not have a domain, you can use a self-signed SSL certificate for testing. To create a self-managed SSL certificate resource called www-ssl-cert:
    gcloud compute ssl-certificates create www-ssl-cert \
        --certificate [CRT_FILE_PATH] \
        --private-key [KEY_FILE_PATH]
    
    To create a Google-managed SSL certificate resource called www-ssl-cert:
    gcloud compute ssl-certificates create www-ssl-cert \
        --domains [DOMAIN]
    
  5. Create a target HTTPS proxy to route requests to your URL map. The proxy is the portion of the load balancer that holds the SSL certificate for HTTPS Load Balancing, so you also load your certificate in this step.
    gcloud compute target-https-proxies create helloworld-https-proxy \
        --ssl-certificates=www-ssl-cert \
        --url-map=helloworld-url-map
    
  6. Create a global forwarding rule to route incoming requests to the proxy.
    gcloud compute forwarding-rules create https-content-rule \
        --address=example-ip \
        --target-https-proxy=helloworld-https-proxy \
        --global \
        --ports=443
    

Testing the external HTTP(S) load balancer

Now that you have configured your load balancer, you can start sending traffic to the load balancer's IP address.

Console

  1. Go to the Load balancing page in the Google Cloud Console.
  2. You can test your load balancer using a web browser by going to https://ip-address where ip-address is the load balancer's IP address. You should be directed to the helloworld service homepage.

gcloud: using curl

Use the curl command to test the response from the URL. Replace ip-address with the load balancer's IPv4 address. You should be directed to the helloworld service homepage.

curl https://ip-address

Additional configuration options

This section expands on the configuration example to provide alternative and additional configuration options. All of the tasks are optional. You can perform them in any order.

Setting up multi-region load balancing

In the example described above we have only one Cloud Run (fully managed) service serving as the backend. Because the serverless NEG can only point to one endpoint at a time, load balancing isn't actually performed. The external HTTP(S) load balancer serves as the frontend only, and it proxies traffic to the specified helloworld app endpoint. However, you might want to serve your Cloud Run (fully managed) app from more than one region to improve your service’s availability as well as to improve latency for users.

If a backend service contains several NEGs, the load balancer balances traffic by forwarding requests to the serverless NEG in the closest available region. However, backend services can only contain one serverless NEG per region. To make your Cloud Run (fully managed) service available from multiple regions you will need to set up cross-region routing. You should be able to use a single URL scheme that works anywhere in the world, yet serves user requests from the region closest to the user. If the closest region is unavailable or is short on capacity, the request will be routed to a different region.

To set up multi-region serving, you will need to use the Premium network tier to ensure that all the regional Cloud Run (fully managed) deployments are compatible and ready to serve traffic from any region.

To set up a multi-region load balancer:

  1. Set up two Cloud Run (fully managed) services in different regions. Let's assume you have deployed two Cloud Run (fully managed) services: one to a region in Europe, and another to a region in the US.
  2. Create an external HTTP(S) load balancer with the following setup:
    1. Set up a global backend service with two serverless NEGs.
      1. Create the first NEG in the same region as the Cloud Run service deployed in Europe.
      2. Create the second NEG in the same region as the Cloud Run service deployed in the US.
    2. Set up your frontend configuration with the Premium network tier.
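The backend setup in step 2a can be sketched as follows. This assumes a service named helloworld has been deployed to both europe-west1 and us-central1 (substitute your own service names and regions):

```shell
# One serverless NEG per region, both attached to a single
# global backend service.
gcloud beta compute network-endpoint-groups create helloworld-neg-eu \
    --region=europe-west1 \
    --network-endpoint-type=SERVERLESS \
    --cloud-run-service=helloworld

gcloud beta compute network-endpoint-groups create helloworld-neg-us \
    --region=us-central1 \
    --network-endpoint-type=SERVERLESS \
    --cloud-run-service=helloworld

gcloud compute backend-services create helloworld-backend-service \
    --global

gcloud beta compute backend-services add-backend helloworld-backend-service \
    --global \
    --network-endpoint-group=helloworld-neg-eu \
    --network-endpoint-group-region=europe-west1

gcloud beta compute backend-services add-backend helloworld-backend-service \
    --global \
    --network-endpoint-group=helloworld-neg-us \
    --network-endpoint-group-region=us-central1
```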

The rest of the setup can be the same as described previously. Your resulting setup should look like this:

[Diagram: multi-region routing for serverless apps (with failover)]

Setting up regional routing

A common reason for serving applications from multiple regions is to meet data locality requirements. For example, you might want to ensure that requests made by European users are always served from a region located in Europe. To set this up, you need a URL schema with separate URLs for EU and non-EU users, and you must direct your EU users to the EU URLs.

In such a scenario, you would use the URL map to route requests from specific URLs to their corresponding regions. With such a setup, requests meant for one region are never delivered to a different region. This provides isolation between regions. On the other hand, when a region fails, requests are not routed to a different region. So this setup does not increase your service’s availability.

To set up regional routing, you will need to use the Premium network tier so that you can combine different regions in a single forwarding rule.

To set up a load balancer with regional routing:

  1. Set up two Cloud Run (fully managed) services in different regions. Let's assume you have deployed two Cloud Run (fully managed) services: hello-world-eu to a region in Europe, and hello-world-us to a region in the US.
  2. Create an external HTTP(S) load balancer with the following setup:
    1. Set up a backend service with a serverless NEG in Europe. The serverless NEG must be created in the same region as the Cloud Run (fully managed) service deployed in Europe.
    2. Set up a second backend service with another serverless NEG in the US. This serverless NEG must be created in the same region as the Cloud Run service deployed in the US.
    3. Set up your URL map with the appropriate host and path rules so that one set of URLs routes to the European backend service while all other requests route to the US backend service.
    4. Set up your frontend configuration with the Premium network tier.
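The URL map from step 2c can be sketched as follows. This assumes the two backend services are named hello-world-eu-backend-service and hello-world-us-backend-service (hypothetical names), and that European traffic arrives under an /eu/ path prefix:

```shell
# Default all traffic to the US backend service.
gcloud compute url-maps create helloworld-url-map \
    --default-service=hello-world-us-backend-service

# Route requests under /eu/ to the European backend service.
gcloud compute url-maps add-path-matcher helloworld-url-map \
    --path-matcher-name=eu-matcher \
    --default-service=hello-world-us-backend-service \
    --path-rules="/eu/*=hello-world-eu-backend-service"
```

You could equally key the routing on host names with gcloud compute url-maps add-host-rule if your EU users are served from a separate domain.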

The rest of the setup can be the same as described previously. Your resulting setup should look like this:

[Diagram: regional routing for serverless apps (no failover)]

Using a URL mask

When creating a serverless NEG, instead of selecting a specific Cloud Run (fully managed) service, you can use a URL mask to point to multiple services serving at the same domain. A URL mask is a template of your URL schema. The serverless NEG will use this template to extract the service name from the incoming request's URL and map the request to the appropriate service.

URL masks are particularly useful if your service is mapped to a custom domain rather than the default address that Google Cloud provides for the deployed service. A URL mask allows you to target multiple services and versions with a single rule even when your application is using a custom URL pattern.

If you haven't already done so, make sure you read Serverless NEGs overview: URL masks.

Constructing a URL mask

To construct a URL mask for your load balancer, start with the URL of your service. For this example, we will use a sample serverless app running at https://example.com/login. This is the URL where the app's login service will be served.

  1. Remove the http:// or https:// prefix from the URL. You are left with example.com/login.
  2. Replace the service name with a placeholder for the URL mask.
    1. Cloud Run (fully managed): Replace the Cloud Run (fully managed) service name with the placeholder <service>. If the service has a tag associated with it, replace the tag name with the placeholder <tag>. In this example, the resulting URL mask is example.com/<service>.
    2. Cloud Functions: Replace the function name with the placeholder <function>. In this example, the resulting URL mask is example.com/<function>.
    3. App Engine: Replace the service name with the placeholder <service>. If the service has a version associated with it, replace the version with the placeholder <version>. In this example, the resulting URL mask is example.com/<service>.
  3. (Optional) If the service name (or function, version, or tag) can be extracted from the path portion of the URL, the domain can be omitted. The path part of the URL mask is distinguished by the first / character. If a / is not present in the URL mask, the mask is understood to represent the host only. Therefore, for this example, the URL mask can be reduced to /<service> or /<function>.

    Similarly, if the service name can be extracted from the host part of the URL, you can omit the path altogether from the URL mask.

    You can also omit any host or subdomain components that come before the first placeholder as well as any path components that come after the last placeholder. In such cases, the placeholder captures the required information for the component.

Here are a few more examples that demonstrate these rules:

Cloud Run

This table assumes that you have a custom domain called example.com and all your Cloud Run (fully managed) services are being mapped to this domain.

Service and tag              Cloud Run (fully managed) custom domain URL   URL mask
service: login               https://login-home.example.com/web            <service>-home.example.com
service: login               https://example.com/login/web                 example.com/<service> or /<service>
service: login, tag: test    https://test.login.example.com/web            <tag>.<service>.example.com
service: login, tag: test    https://example.com/home/login/test           example.com/home/<service>/<tag> or /home/<service>/<tag>
service: login, tag: test    https://test.example.com/home/login/web       <tag>.example.com/home/<service>

Cloud Functions

This table assumes that you have a custom domain called example.com and all your functions are mapped to this domain.

Function name   Cloud Functions custom domain URL   URL mask
login           https://example.com/login           /<function>
login           https://example.com/home/login      /home/<function>
login           https://login.example.com           <function>.example.com
login           https://login.home.example.com      <function>.home.example.com

App Engine

This table assumes that you have a custom domain called example.com and all your App Engine services are being mapped to this domain.

Service and version            App Engine custom domain URL         URL mask
service: login                 https://login.example.com/web        <service>.example.com
service: login                 https://example.com/home/login/web   example.com/home/<service> or /home/<service>
service: login, version: test  https://test.example.com/login/web   <version>.example.com/<service>
service: login, version: test  https://example.com/login/test       example.com/<service>/<version>

Creating a serverless NEG with a URL mask

gcloud: Cloud Run

To create a serverless NEG with a sample URL mask of example.com/<service>:

gcloud beta compute network-endpoint-groups create helloworld-serverless-neg \
    --region=us-central1 \
    --network-endpoint-type=SERVERLESS \
    --cloud-run-url-mask=example.com/<service>

gcloud: Cloud Functions

To create a serverless NEG with a sample URL mask of example.com/<function>:

gcloud beta compute network-endpoint-groups create helloworld-serverless-neg \
    --region=us-central1 \
    --network-endpoint-type=SERVERLESS \
    --cloud-function-url-mask=example.com/<function>

gcloud: App Engine

To create a serverless NEG with a sample URL mask of example.com/<service>:

gcloud beta compute network-endpoint-groups create helloworld-serverless-neg \
    --region=us-central1 \
    --network-endpoint-type=SERVERLESS \
    --app-engine-url-mask=example.com/<service>

To learn how the load balancer handles issues with URL mask mismatches, see Troubleshooting issues with serverless NEGs.

Moving your custom domain to be served by the external HTTP(S) load balancer

If your serverless compute apps are being mapped to custom domains, you might want to update your DNS records so that traffic sent to the existing Cloud Run (fully managed), Cloud Functions, or App Engine custom domain URLs is routed through the load balancer instead.

For example, if you have a custom domain called example.com and all your Cloud Run services are mapped to this domain, you should update the DNS record for example.com to point to the load balancer's IP address.

Before updating the DNS records, you can test your configuration locally by forcing local DNS resolution of the custom domain to the load balancer's IP address. To test locally, either modify the /etc/hosts file on your local machine to point example.com to the load balancer's IP address, or use the curl --resolve flag to force curl to use the load balancer's IP address for the request.
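For example, assuming your custom domain is example.com and 203.0.113.10 is a placeholder for your load balancer's reserved IP address, you can pin name resolution for a single request:

```shell
# Force curl to connect to the load balancer's IP for example.com,
# without touching public DNS or /etc/hosts.
curl --resolve example.com:443:203.0.113.10 https://example.com/
```

If you are testing with a self-signed certificate, add the -k flag so that curl skips certificate verification.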

When the DNS record for example.com resolves to the HTTP(S) load balancer’s IP address, requests sent to example.com begin to be routed via the load balancer. The load balancer dispatches them to the relevant backend service according to its URL map. Additionally, if the backend service is configured with a URL mask, the serverless NEG uses the mask to route the request to the appropriate Cloud Run (fully managed), Cloud Functions, or App Engine service.

Enabling Cloud CDN

Enabling Cloud CDN for your Cloud Run (fully managed) service allows you to optimize content delivery by caching content close to your users.

You can enable Cloud CDN on the external HTTP(S) load balancer's backend service using the gcloud compute backend-services update command.

  gcloud compute backend-services update helloworld-backend-service \
    --enable-cdn \
    --global

Cloud CDN is supported for backend services with Cloud Run (fully managed), Cloud Functions, and App Engine backends.

Enabling Google Cloud Armor

Google Cloud Armor is a security product that provides protection against distributed denial-of-service (DDoS) attacks for all proxy-based Google Cloud load balancers. Google Cloud Armor also provides configurable security policies for services accessed through an external HTTP(S) load balancer. To learn about Google Cloud Armor security policies for HTTP(S) Load Balancing, see Google Cloud Armor security policy overview.

While Google Cloud Armor can be configured for backend services with Cloud Run (fully managed), Cloud Functions, and App Engine backends, there are certain limitations associated with this capability, especially with Cloud Run (fully managed) and App Engine. Users who have access to the default URLs assigned to these services by Google Cloud can bypass the load balancer and go directly to the service URLs, circumventing any configured Google Cloud Armor security policies.

If you are using Cloud Functions, you can mitigate this by using the internal-and-gclb ingress setting to ensure that requests sent to default cloudfunctions.net URLs or any other custom domain set up through Cloud Functions are blocked.

Deleting a serverless NEG

A network endpoint group cannot be deleted if it is attached to a backend service. Before you delete a NEG, ensure that it is detached from the backend service.

To remove a serverless NEG from a backend service, you must specify the region where the NEG was created. You must also specify the --global flag because helloworld-backend-service is a global resource.

gcloud beta compute backend-services remove-backend helloworld-backend-service \
    --network-endpoint-group=helloworld-serverless-neg \
    --network-endpoint-group-region=us-central1 \
    --global

To delete the serverless NEG:

gcloud beta compute network-endpoint-groups delete helloworld-serverless-neg \
    --region=us-central1

Stop sending traffic to the backend service

Removing all serverless NEGs from a backend service doesn't automatically result in 502 Bad Gateway responses for users. This is a known issue. Users continue to see HTTP 200 OK responses to their requests even after all serverless NEG backends have been removed from a backend service. If you want to change this behavior:

  1. Create a new backend service with zero backends.
  2. Update the load balancer's URL map to use this new backend service for all requests by default.
  3. Remove the original backend service that had the serverless NEGs attached to it.

Users will now see HTTP 502 Bad Gateway responses (or other errors) as expected.
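The steps above can be sketched as follows, assuming the URL map and backend service names from the earlier example (the name empty-backend-service is hypothetical):

```shell
# 1. Create a new backend service with zero backends attached.
gcloud compute backend-services create empty-backend-service \
    --global

# 2. Make the empty backend service the URL map's default service.
gcloud compute url-maps set-default-service helloworld-url-map \
    --default-service=empty-backend-service

# 3. Delete the original backend service that had the serverless
#    NEGs attached to it.
gcloud compute backend-services delete helloworld-backend-service \
    --global
```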

What's next