Troubleshooting

Use the following guide to troubleshoot common issues with Cloud Router:

Configuration issues

BGP session failed to establish

  • Check that the settings on your on-premises BGP router and the settings on your Cloud Router are correct. View the Cloud Router logs for detailed information.
  • If you're creating a Cloud VPN tunnel, check that the status of the tunnel is ESTABLISHED. If it isn't, see Cloud VPN troubleshooting to resolve the issue.

IP addresses for BGP sessions

You must use link-local IP addresses (169.254.0.0/16) for BGP sessions. You cannot use any other external or internal IP address.
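As a quick sanity check before configuring a session, the range requirement above can be tested with Python's standard ipaddress module. This is a minimal sketch; the function name is hypothetical and is not part of any Google Cloud tooling.

```python
import ipaddress

# The link-local range required for BGP session addresses, as stated above.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")

def is_valid_bgp_session_ip(address: str) -> bool:
    """Return True if the address falls inside 169.254.0.0/16."""
    return ipaddress.ip_address(address) in LINK_LOCAL

# Example addresses are illustrative only.
link_local_ok = is_valid_bgp_session_ip("169.254.10.1")
internal_ok = is_valid_bgp_session_ip("10.0.0.1")
```

Here the first check passes and the second fails, because 10.0.0.1 is an internal address outside the link-local range.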

Invalid value for field resource.bgp.asn

You may get this error: "Invalid value for field resource.bgp.asn: ######. Local ASN conflicts with peer ASN specified by a router in the same region and network."

Cloud Router is attempting to establish a BGP session with an on-premises device that has the same ASN as the Cloud Router. To resolve this issue, change the ASN of your device or Cloud Router.

iBGP between Cloud Routers in a single region doesn't work

Although you can create two Cloud Routers with the same ASN, iBGP isn't supported.

Cloud Router issues

BGP resets originating from Google Cloud appear on your router

Cloud Router tasks are software processes in the Google Cloud control plane that are normally migrated from machine to machine. During such migrations, Cloud Router might be down for a few seconds. Normal migrations do not cause traffic to be dropped.

Cloud Router is not located in the data path and is not acting as a Layer 3 switch, but as a manager for route programming. Routing is actually handled by the interconnect attachment (VLAN) or the Cloud VPN tunnel.

Route processing issues

On-premises routes without a MED value are taking priority

If Cloud Router receives an on-premises route that doesn't have a MED value, Cloud Router follows the behavior described in RFC 4271: it assumes the lowest possible MED value (0), which gives the route the highest priority.
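The effect of a missing MED can be illustrated with a small sketch. It models only the MED comparison, not the full BGP decision process, and the route tuples and peer names are hypothetical.

```python
from typing import Optional

def effective_med(med: Optional[int]) -> int:
    """Treat a missing MED as 0, the lowest possible value."""
    return 0 if med is None else med

# (prefix, peer label, MED). All values are made up for illustration.
routes = [
    ("10.0.0.0/8", "peer-a", 100),
    ("10.0.0.0/8", "peer-b", None),  # advertised without a MED
]

# The lowest effective MED wins, so the route without a MED is preferred.
preferred = min(routes, key=lambda route: effective_med(route[2]))
```

To avoid this outcome, set an explicit MED on every route that your on-premises router advertises.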

You can't send and learn MED values over an L3 Partner Interconnect connection

If you are using a Partner Interconnect connection where a Layer 3 Partner handles BGP for you, Cloud Router can't learn MED values from your on-premises router or send MED values to that router. MED is a non-transitive attribute, so it isn't carried across the partner's autonomous system. As a result, over this type of connection you can't set priorities for routes advertised by Cloud Router to your on-premises router, and you can't set priorities for routes advertised by your on-premises router to your VPC network.

Some on-premises IP prefixes aren't available

Check quotas and limits

Check that your Cloud Routers haven't exceeded the limits for learned routes. To view the number of learned routes for a Cloud Router, view its status. You can reduce the number of advertised destinations (prefixes) by configuring your on-premises router to coalesce them.

You can also view the logs to check whether routes are being dropped because of the limit.

Two limits control how many custom dynamic routes Cloud Router can import into a Virtual Private Cloud (VPC) network. These limits don't directly cap the number of custom dynamic routes. Instead, each one caps the number of unique destination prefixes:

  • The maximum number of unique destinations for learned routes that can be applied to subnets in a given region by all Cloud Routers in the same region
  • The maximum number of unique destinations for learned routes that can be applied to subnets in a given region by Cloud Routers from different regions

The first limit applies regardless of the dynamic routing mode used by the VPC network. The second limit applies only if the VPC network uses global dynamic routing mode. For details about limits, see the Limits page for Cloud Router.

When you encounter either of these limits, you'll see a limit-exceeded message in Cloud Logging. For information about how to create an advanced query to view this message, see the limit-exceeded query in the logging documentation for Cloud Router.

You can do the following things to resolve route limit issues. In situations where the number of routes exceeds the limits by a large amount, it makes sense to do both:

  • Configure your on-premises routers to aggregate the routes that you export, so that those routes advertise fewer destinations (CIDRs).
  • Contact Support. Support can work with you to reset your Cloud Routers, if needed, or to increase limits.
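To estimate how much aggregation could help, you can collapse a list of advertised prefixes offline. The following sketch uses ipaddress.collapse_addresses from the Python standard library; the prefix list is a made-up example, and the actual aggregation must still be configured on your on-premises router.

```python
import ipaddress
from typing import List

def aggregate(prefixes: List[str]) -> List[str]:
    """Merge adjacent or overlapping prefixes into the fewest exact CIDRs."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]

# Hypothetical on-premises advertisements (all IPv4).
advertised = [
    "10.1.0.0/24",
    "10.1.1.0/24",
    "10.1.2.0/24",
    "10.1.3.0/24",
    "192.168.5.0/24",
]

aggregated = aggregate(advertised)
```

In this example the four contiguous /24 prefixes merge into 10.1.0.0/22, so five destinations become two. collapse_addresses only performs exact merges, so the result never covers address space that the input didn't.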

Check overlapping subnet ranges

Ensure that the IP address ranges for a VPC subnet don't fully overlap with route advertisements from your on-premises network. Overlapping IP ranges can cause routes to be dropped. This also applies to custom static routes that overlap with a dynamic route learned by Cloud Router. Prefixes received by Cloud Routers are ignored (custom dynamic routes are not created) in the following scenarios:

  • When the prefix learned exactly matches a primary or secondary IP address range of a subnet in your VPC network.
  • When the prefix learned exactly matches the destination of a custom static route in your VPC network.
  • When the prefix learned is more specific (has a longer subnet mask) than a primary or secondary IP address range of a subnet in your VPC network.
  • When the prefix learned is more specific (has a longer subnet mask) than the destination of a custom static route in your VPC network.

For more information, see Applicability and order of routes in the VPC Routes overview.
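The four scenarios listed above reduce to one test: a learned prefix is ignored when it equals, or is contained in, a subnet range or a custom static route destination. The following is a minimal sketch of that test, assuming you have already collected those ranges as CIDR strings; the function and parameter names are hypothetical.

```python
import ipaddress
from typing import Iterable

def is_ignored(learned: str,
               subnet_ranges: Iterable[str],
               static_route_destinations: Iterable[str]) -> bool:
    """Return True if the learned prefix matches one of the four scenarios."""
    prefix = ipaddress.ip_network(learned)
    local_ranges = [
        ipaddress.ip_network(r)
        for r in (*subnet_ranges, *static_route_destinations)
    ]
    # subnet_of() is True for an exact match and for a more specific prefix.
    # It raises TypeError across IP versions, so compare versions first.
    return any(
        prefix.version == r.version and prefix.subnet_of(r)
        for r in local_ranges
    )
```

For example, with a subnet range of 10.0.0.0/16, the learned prefixes 10.0.0.0/16 and 10.0.1.0/24 are both reported as ignored, while the broader 10.0.0.0/8 is not.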

Routes learned from an on-premises network aren't propagating to other VPC networks

A single Cloud Router can't re-advertise routes learned from one BGP peer to other BGP peers, including to Cloud Routers in other VPC networks. The following hub and spoke topology illustrates this limitation.

Figure: Cloud Router hub and spoke

Consider the following alternatives for this topology:

  • Create a single VPC network to replace multiple existing VPC networks. Connect the replacement VPC network to your on-premises network by using Cloud VPN or Cloud Interconnect. If you need to maintain a configuration that delegates administrative capabilities among projects with a single VPC network, use Shared VPC. Note that you must recreate virtual machine (VM) instances and other resources in the single replacement network. You cannot simply move them from one network to another.
  • Continue to maintain separate VPC networks. Connect each network to your on-premises network by using Cloud VPN or Cloud Interconnect.
  • Use VPC Network Peering to connect two VPC networks together. Configure the network whose Cloud Router imports routes from on-premises to export its custom routes. For more information, see Importing and exporting custom routes.

Cloud Router doesn't use ECMP across routes with different origin ASNs

For cases where you have multiple on-premises routers connected to a single Cloud Router, the Cloud Router learns and propagates routes from the router with the lowest ASN. Cloud Router ignores advertised routes from routers with higher ASNs, which might result in unexpected behavior. For example, you might have two on-premises routers advertise routes that are using two different Cloud VPN tunnels. You expect traffic to be load balanced between the tunnels, but Google Cloud uses only one of the tunnels because Cloud Router only propagated routes from the on-premises router with the lower ASN.
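The behavior described above can be modeled as a filter that keeps only the advertisements from the peer with the lowest ASN. This is an illustrative sketch of the documented outcome, not Cloud Router's implementation; the ASNs and prefixes are made up.

```python
from typing import List, Tuple

# Each advertisement is (peer_asn, prefix). Values are hypothetical.
Advertisement = Tuple[int, str]

def propagated(advertisements: List[Advertisement]) -> List[Advertisement]:
    """Keep only the advertisements from the peer with the lowest ASN."""
    if not advertisements:
        return []
    lowest_asn = min(asn for asn, _ in advertisements)
    return [ad for ad in advertisements if ad[0] == lowest_asn]

# Two on-premises routers advertise the same prefix over different tunnels.
ads = [(65001, "10.10.0.0/16"), (65002, "10.10.0.0/16")]
kept = propagated(ads)
```

Only the advertisement from ASN 65001 survives, which is why traffic uses a single tunnel. If both on-premises routers used the same ASN, both advertisements would pass this filter.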

Prefixes aren't getting imported into BGP sessions (AS path prepending)

AS path prepending is one method that some networks use to prioritize BGP advertisements. In Google Cloud, the advantages of AS path prepending are reduced because VPC networks are global by design.

When a VPC network contains more than one Cloud Router, in one region or in multiple regions, it uses the following algorithm to select next hops for the destinations (prefixes) it learns:

  • If a destination (prefix) is received by exactly one BGP session, AS path prepending has no effect. The next hop of each custom dynamic route created in the VPC network is the next hop associated with the one BGP session.
  • If a destination (prefix) is received by more than one BGP session, whether on the same Cloud Router, or on different Cloud Routers, Google Cloud ignores any next hops that have an associated AS path prepending. The next hop of each custom dynamic route created in the VPC network is a next hop whose BGP session does not use AS path prepending. If multiple Cloud Routers are involved, it does not matter whether the Cloud Routers are in the same or different regions.

In all situations, whether one or more custom dynamic routes are created depends on the dynamic routing mode of the VPC network.
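The two cases of the selection algorithm above can be sketched as follows. The Session type and its fields are hypothetical. The case where every session for a prefix uses prepending isn't covered by the description, so the sketch returns an empty list for it rather than guessing.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Session:
    """A BGP session that received a given prefix (hypothetical model)."""
    name: str
    next_hop: str
    prepended: bool  # True if the advertisement used AS path prepending

def candidate_next_hops(sessions: List[Session]) -> List[str]:
    """Return the next hops eligible for a single destination prefix."""
    # Exactly one session: prepending has no effect.
    if len(sessions) <= 1:
        return [s.next_hop for s in sessions]
    # More than one session: next hops with prepending are ignored.
    return [s.next_hop for s in sessions if not s.prepended]

single = candidate_next_hops([Session("a", "tunnel-1", True)])
multiple = candidate_next_hops([
    Session("a", "tunnel-1", True),
    Session("b", "tunnel-2", False),
])
```

With one session, tunnel-1 is used even though its advertisement is prepended. With two sessions, only tunnel-2 remains eligible.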

For the most consistent results, do not use AS path prepending. Instead, configure the MED value.

On a multi-NIC VM, each NIC gets different routes

This is the expected behavior. You must configure each network interface (NIC) of a multi-NIC VM in a unique VPC network. Each Cloud Router creates custom dynamic routes in one VPC network. Thus, the routes learned by one Cloud Router are only applicable to one network interface of a multi-NIC VM. Packets sent from a VM's network interface use only the routes applicable to the VPC network for that interface.

Traffic is being routed asymmetrically

Traffic is routed asymmetrically when ingress and egress traffic use different paths. For example, you might have two Cloud VPN tunnels. Egress traffic from your VPC network might use the first tunnel, while ingress traffic into your VPC network might use the second tunnel.

Asymmetric routing happens when the preferred paths advertised by your on-premises router and by Cloud Router don't align. For ingress traffic into your VPC network, use Cloud Router to configure advertised route priorities. For more information, see Best path for egress traffic from Google Cloud to your on-premises network.

For egress traffic out of your VPC network, check your on-premises router's local preference or MED values.

The default route (0.0.0.0/0) is sending traffic to the internet gateway

When you create a VPC network, Google Cloud automatically creates a default route with a priority of 1000 whose next hop is the default internet gateway.

Routes with a next hop of the default internet gateway can only be used by VMs that meet internet access requirements.

Routes with a next hop of the default internet gateway are also required to access Google APIs and services, for example when you use Private Google Access.

The following situations can cause traffic to the internet or to Google APIs and services to be blocked:

  • You delete the automatically created default route (the route with a next hop of the default internet gateway).
  • You replace the automatically created default route, and the next hop of the replacement route is different from the default internet gateway.
  • A Cloud Router learns a route with destination 0.0.0.0/0 that has a higher priority than the automatically created default route.
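For the last situation, it helps to remember that in Google Cloud a lower priority number means a higher priority. The following sketch compares a learned 0.0.0.0/0 route against the automatically created default route, which has priority 1000; the function name and sample values are hypothetical.

```python
# Priority of the automatically created default route, as stated above.
AUTO_DEFAULT_ROUTE_PRIORITY = 1000

def learned_default_takes_precedence(learned_priority: int) -> bool:
    """Return True if a learned 0.0.0.0/0 route outranks the default route.

    A lower number means a higher priority.
    """
    return learned_priority < AUTO_DEFAULT_ROUTE_PRIORITY

# Hypothetical priorities of learned default routes.
overrides = learned_default_takes_precedence(100)
does_not_override = learned_default_takes_precedence(2000)
```

A learned default route with priority 100 takes precedence over the internet gateway route, while one with priority 2000 does not. The sketch doesn't model how Google Cloud handles equal priorities.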

The next hop isn't clear

See Applicability and order in the VPC Routes documentation to learn how Google Cloud's route selection algorithm works.