Troubleshoot BGP routes and route selection

This guide is for troubleshooting issues related to BGP routes, including route selection, route propagation, and route priorities.

IPv6 BGP session is established but does not exchange IPv4 routes

  1. Verify that the VLAN attachment or HA VPN gateway has the required stack type of IPV4_IPV6. If the stack type is incorrect for the VLAN attachment, modify the VLAN attachment. For an HA VPN gateway, recreate the HA VPN gateway and its tunnels.

  2. Ensure that your Cloud Router is configured properly. Run the following command:

    gcloud compute routers describe ROUTER_NAME \
        --region=REGION
    

    In the output, check the following values:

    • bgpPeers.enableIpv4 is true
    • bgpPeers.ipv4NexthopAddress and bgpPeers.peerIpv4NexthopAddress are present
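
If you check many routers, you can script the same inspection. The following Python sketch assumes that you have the describe output as parsed JSON (for example, from running the command with --format=json). The function name and the sample data are hypothetical; only the three field names come from the list above. The check is meaningful only for IPv6 BGP sessions that are expected to exchange IPv4 routes.

```python
def find_ipv4_exchange_problems(router):
    """Map each BGP peer name to the checks it fails.

    `router` is assumed to be the parsed JSON output of
    `gcloud compute routers describe` (an assumption of this sketch).
    """
    problems = {}
    for peer in router.get("bgpPeers", []):
        issues = []
        if peer.get("enableIpv4") is not True:
            issues.append("enableIpv4 is not true")
        for field in ("ipv4NexthopAddress", "peerIpv4NexthopAddress"):
            if not peer.get(field):
                issues.append(field + " is missing")
        if issues:
            problems[peer.get("name", "<unnamed>")] = issues
    return problems


# Hypothetical sample data; peer names and addresses are made up.
sample_router = {
    "bgpPeers": [
        {
            "name": "peer-ok",
            "enableIpv4": True,
            "ipv4NexthopAddress": "169.254.0.1",
            "peerIpv4NexthopAddress": "169.254.0.2",
        },
        {"name": "peer-broken", "enableIpv4": False},
    ]
}

print(find_ipv4_exchange_problems(sample_router))
```

An empty result means that every peer in the input passed all three checks.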

Some on-premises IPv4 or IPv6 prefixes aren't available

If some on-premises IPv4 or IPv6 prefixes aren't available, check whether you have exceeded quotas and limits and whether subnet ranges overlap. See Check quotas and limits and Check overlapping subnet ranges.

Custom learned routes are inactive

If you have configured a custom learned route but are experiencing traffic loss, ping errors, or other problems related to the route, do the following:

  • Check that the route is configured properly on the BGP session.
  • Check that the BGP session is up.
  • Check for dropped routes.
  • Check quotas and limits.
  • Check for overlapping subnet ranges.

For more information, see Check the status of custom learned routes.

Check for dropped routes

To see if a route is dropped, run the following command:

gcloud compute routers get-status ROUTER_NAME \
    --region=REGION

Replace the following:

  • ROUTER_NAME: the name of your Cloud Router.
  • REGION: the region that your Cloud Router is located in.

The output is similar to the following:

kind: compute#routerStatusResponse
result:
  bestRoutesForRouter:
  - asPaths:
    - asLists:
      - 65200
      pathSegmentType: AS_SEQUENCE
    creationTimestamp: '2024-03-22T13:57:15.533-07:00'
    destRange: 10.128.0.0/20
    kind: compute#route
    network: https://www.googleapis.com/compute/v1/projects/PROJECT/global/networks/VPC_NAME
    nextHopIp: 169.254.73.246
    nextHopVpnTunnel: https://www.googleapis.com/compute/v1/projects/PROJECT/regions/REGION/vpnTunnels/VPN_NAME
    priority: 100
    routeStatus: ACTIVE
    routeType: BGP
  bgpPeerStatus:
  - advertisedRoutes:
    - destRange: 10.128.0.0/20
      kind: compute#route
      network: https://www.googleapis.com/compute/v1/projects/PROJECT/global/networks/VPC_NAME
      nextHopIp: 169.254.73.245
      nextHopVpnTunnel: https://www.googleapis.com/compute/v1/projects/PROJECT/regions/REGION/vpnTunnels/VPN_NAME
      priority: 100
      routeType: BGP
    enableIpv6: false
    ipAddress: 169.254.73.245
    linkedVpnTunnel: https://www.googleapis.com/compute/v1/projects/PROJECT/regions/REGION/vpnTunnels/VPN_NAME
    md5AuthEnabled: false
    name: aneta-bgp
    numLearnedRoutes: 1
    peerIpAddress: 169.254.73.246
    state: Established
    status: UP
    uptime: 10 hours, 11 minutes, 0 seconds
    uptimeSeconds: '36660'
  network: https://www.googleapis.com/compute/v1/projects/PROJECT/global/networks/VPC_NAME

The bestRoutesForRouter.routeStatus value displays ACTIVE for an active route, and DROPPED for a dropped route.
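
For a router with many routes, you can filter the status output instead of reading it by eye. The following Python sketch assumes that the get-status output is available as parsed JSON with the same structure as the sample above. The function name and the second route in the sample data are hypothetical.

```python
def dropped_routes(status):
    """Return the destination ranges of routes marked DROPPED.

    `status` is assumed to mirror the get-status output shown above:
    a top-level `result` with a `bestRoutesForRouter` list.
    """
    best_routes = status.get("result", {}).get("bestRoutesForRouter", [])
    return [
        route.get("destRange")
        for route in best_routes
        if route.get("routeStatus") == "DROPPED"
    ]


# The first route matches the sample output; the second is made up.
sample_status = {
    "result": {
        "bestRoutesForRouter": [
            {"destRange": "10.128.0.0/20", "routeStatus": "ACTIVE"},
            {"destRange": "192.168.10.0/24", "routeStatus": "DROPPED"},
        ]
    }
}

print(dropped_routes(sample_status))
```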

Check quotas and limits

Check that your Cloud Routers haven't exceeded the limits for learned routes. To view the number of learned routes for a Cloud Router, view its status.
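
As a rough first check, you can total the numLearnedRoutes values that the router status reports for each BGP peer. The following Python sketch assumes parsed JSON with the same structure as the get-status output; the function name is hypothetical. The per-peer counts can include the same destination more than once, whereas the limits count unique destinations, so treat the total as an upper bound rather than the exact figure.

```python
def learned_route_counts(status):
    """Return {peer name: number of learned routes} from router status."""
    peers = status.get("result", {}).get("bgpPeerStatus", [])
    # int() tolerates the count arriving as either a number or a string.
    return {p.get("name"): int(p.get("numLearnedRoutes", 0)) for p in peers}


# Sample data: the first peer matches the status output in this guide;
# the second peer is made up.
sample_status = {
    "result": {
        "bgpPeerStatus": [
            {"name": "aneta-bgp", "numLearnedRoutes": 1},
            {"name": "second-peer", "numLearnedRoutes": 40},
        ]
    }
}

counts = learned_route_counts(sample_status)
print(counts, "total:", sum(counts.values()))
```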

For information about the limits, related log messages, metrics, and how to resolve issues, see the following guidance.

Learned routes: The maximum number of unique destinations for learned routes that are applied to subnets in a given region by all Cloud Routers in the same region.

Logs: When you reach a learned routes limit, a limit-exceeded message appears in Cloud Logging. For information about how to create an advanced query to view this message, see the related query in the logging documentation for Cloud Router.

Metrics: You can also use the following metrics to understand your current limits and usage. These metrics are prefixed with router.googleapis.com/dynamic_routes/learned_routes/:

  • used_unique_destinations

    Number of unique destinations that are in use in this VPC network.

  • unique_destinations_limit

    Maximum number of unique destinations that are allowed in this VPC network.

  • any_dropped_unique_destinations

    Indicates whether this VPC network has any destinations dropped due to exceeding one or both of the route quota limits.

These metrics are available through the gce_network_region monitored resource. For more information about Cloud Router metrics and how to view them, see the Metrics section in Viewing logs and metrics.

Resolving issues

You can do the following to resolve route limit issues. In situations where the number of routes exceeds the limits by a large amount, it makes sense to do both:

  • Configure your on-premises routers to aggregate the routes that you export so that those routes advertise fewer destinations (CIDRs).
  • Contact Support. Support can work with you to reset your Cloud Routers, if needed, or to increase limits.
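
To estimate how much aggregation can reduce the number of destinations, you can collapse a prefix list offline before you change your router configuration. The following Python sketch uses the standard library ipaddress module; the function name and the prefixes are hypothetical. It merges only prefixes that combine exactly into a larger block, so the result never covers addresses that the original list did not.

```python
import ipaddress


def aggregate_prefixes(prefixes):
    """Collapse adjacent or contained CIDRs into the fewest exact prefixes."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    result = []
    # collapse_addresses rejects mixed IP versions, so handle each separately.
    for version in (4, 6):
        same_version = [n for n in networks if n.version == version]
        result.extend(str(n) for n in ipaddress.collapse_addresses(same_version))
    return result


# Hypothetical on-premises prefixes: four contiguous /24 ranges.
on_prem = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
print(aggregate_prefixes(on_prem))
```

In this example, the four /24 prefixes collapse into a single /22, which counts as one destination instead of four.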

Check overlapping subnet ranges

Ensure that the IPv4 and IPv6 address ranges for a VPC subnet don't fully overlap with advertised routes from your on-premises network. Overlapping IPv4 and IPv6 ranges can cause routes to be dropped. This also applies to custom static routes that overlap with a dynamic route learned by a Cloud Router. Prefixes received by Cloud Routers are ignored (custom dynamic routes are not created) in the following scenarios:

  • When the prefix learned exactly matches a primary or secondary IPv4 or IPv6 address range of a subnet in your VPC network.

  • When the prefix learned exactly matches the destination of a custom static route in your VPC network.

  • When the prefix learned is more specific (has a longer subnet mask) than a primary or secondary IPv4 or IPv6 address range of a subnet in your VPC network.

  • When the prefix learned is more specific (has a longer subnet mask) than the destination of a custom static route in your VPC network.

For more information, see Applicability and order of routes in the VPC Routes overview.
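
The four scenarios above reduce to one test: a learned prefix is ignored if it is equal to, or contained in, a subnet range or a custom static route destination. The following Python sketch models that test with the standard library ipaddress module. The function name and sample ranges are hypothetical, and the sketch models only the rules listed in this section, not the full route selection behavior.

```python
import ipaddress


def is_ignored(learned_prefix, subnet_ranges, static_route_destinations):
    """Return True if the learned prefix matches one of the scenarios above."""
    prefix = ipaddress.ip_network(learned_prefix)
    for existing in list(subnet_ranges) + list(static_route_destinations):
        network = ipaddress.ip_network(existing)
        if network.version != prefix.version:
            continue  # IPv4 and IPv6 ranges can't overlap each other
        # subnet_of is True for an exact match or a more specific prefix.
        if prefix.subnet_of(network):
            return True
    return False


# Hypothetical VPC configuration.
subnets = ["10.128.0.0/20"]
static_routes = ["192.168.0.0/16"]

print(is_ignored("10.128.0.0/20", subnets, static_routes))   # exact match
print(is_ignored("10.128.4.0/24", subnets, static_routes))   # more specific
print(is_ignored("10.128.0.0/16", subnets, static_routes))   # less specific
```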

Routes learned from an on-premises network aren't propagating to other VPC networks

A single Cloud Router can't re-advertise routes learned from one BGP peer to other BGP peers, including to Cloud Routers in other VPC networks.

For example, in the following hub and spoke topology, Cloud Router cannot support route advertisement between multiple VPC networks.

Cloud Router hub and spoke.

To review recommendations for network topologies in Google Cloud, see Best practices and reference architectures for VPC design.

In addition, to build and manage hub and spoke topologies in Google Cloud, you can use Network Connectivity Center.

Prefixes aren't getting imported into BGP sessions (AS path prepending)

AS path prepending is irrelevant to the control plane and VPC network. AS path length is only considered within each Cloud Router software task as described in the following scenarios.

If a single Cloud Router software task learns the same destination from two or more BGP sessions:

  • The software task picks a next hop BGP session that has the shortest AS path length.
  • The software task submits destination, next hop, and MED information to the Cloud Router control plane.
  • The control plane uses the information to create one or more candidate routes. Each candidate's base priority is set to the MED received.

If two or more Cloud Router software tasks learn the same destination from two or more BGP sessions:

  • Each software task picks a next hop BGP session that has the shortest AS path length.
  • Each software task submits destination, next hop, and MED information to the Cloud Router control plane.
  • The control plane uses the information to create two or more candidate routes. Each candidate's base priority is set to the MED received.

The Cloud Router control plane then installs one or more dynamic routes in the VPC network, according to the VPC network's dynamic routing mode. In global dynamic routing mode, the priority of each regional dynamic route is adjusted in regions different from the Cloud Router region. For details about how Google Cloud selects a route, see Routing order in the VPC documentation.
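
The following Python sketch is a simplified model of the behavior described above, not Cloud Router's implementation. All names in it are hypothetical, and it assumes that sessions tied on AS path length within a task all produce candidates, which the description above doesn't specify. It shows why prepending has no effect across software tasks: AS path length is compared only among sessions on the same task, and each candidate's base priority comes from the MED.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    task: str         # hypothetical software task identifier
    session: str      # BGP session that would become the next hop
    destination: str
    as_path_len: int
    med: int


def candidate_routes(advertisements):
    """Return sorted (destination, session, base priority) candidates."""
    by_task_and_destination = defaultdict(list)
    for adv in advertisements:
        by_task_and_destination[(adv.task, adv.destination)].append(adv)

    candidates = []
    for group in by_task_and_destination.values():
        # AS path length is compared only within one software task.
        shortest = min(adv.as_path_len for adv in group)
        for adv in group:
            if adv.as_path_len == shortest:
                # The candidate's base priority is the MED received.
                candidates.append((adv.destination, adv.session, adv.med))
    return sorted(candidates)


advertisements = [
    Advertisement("task-a", "s1", "10.0.0.0/8", as_path_len=3, med=100),
    Advertisement("task-a", "s2", "10.0.0.0/8", as_path_len=1, med=200),
    Advertisement("task-b", "s3", "10.0.0.0/8", as_path_len=5, med=50),
]

for candidate in candidate_routes(advertisements):
    print(candidate)
```

In this model, session s1 is eliminated by s2 on the same task, but s3 survives despite having the longest AS path because it is the only session on its task. Among the remaining candidates, s3 has the lowest base priority value (50).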

On a multi-NIC VM, each NIC gets different routes

This is the expected behavior. You must configure each network interface controller (NIC) for a multi-NIC VM in a unique VPC network. Each Cloud Router creates custom dynamic routes in one VPC network. Thus, the routes learned by one Cloud Router are only applicable to one network interface of a multi-NIC VM. Packets sent from a VM's network interface use only the routes applicable to the VPC network for that interface.

Traffic is being routed asymmetrically

Traffic is routed asymmetrically when ingress and egress traffic use different paths. For example, you might have two Cloud VPN tunnels. Egress traffic from your VPC network might use the first tunnel, while ingress traffic into your VPC network might use the second tunnel.

Asymmetric routing happens when the preferred paths advertised by your on-premises router and by the Cloud Router don't align. For ingress traffic into your VPC network, use the Cloud Router to configure advertised route priorities. For more information, see Learned routes.

Check your device documentation for how BGP best path selection works, because other attributes (such as router ID or origin ASN) can affect it.

For egress traffic out of your VPC network, check your on-premises router's MED value.

The default route (0.0.0.0/0 or ::/0) is sending traffic to the internet gateway

When you create a VPC network, Google Cloud automatically creates a default route with a priority of 1000 whose next hop is the default internet gateway.

Routes with a next hop of the default internet gateway can only be used by VMs that meet internet access requirements.

Using routes with a next hop of the default internet gateway is also required to access Google APIs and services—for example, when using Private Google Access.

The following examples describe situations that can cause traffic to the internet or to Google APIs and services to be blocked:

  • If you delete the automatically created default route (the route with a next hop of the default internet gateway).

  • If you replace the automatically created default route, and the next hop of the replacement route is different from the default internet gateway.

  • If a Cloud Router learns a route with destination 0.0.0.0/0 or ::/0 that has a higher priority (a lower priority value) than the automatically created default route.
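
To spot the last situation, you can scan learned routes for default destinations whose priority value is lower than 1000. The following Python sketch assumes a list of route dictionaries with destRange and priority fields, as in the router status output. The function name and sample routes are hypothetical, and the sketch ignores any priority adjustment that is applied across regions in global dynamic routing mode.

```python
DEFAULT_ROUTE_PRIORITY = 1000  # priority of the automatically created default route
DEFAULT_DESTINATIONS = {"0.0.0.0/0", "::/0"}


def learned_defaults_that_win(learned_routes):
    """Return learned default routes that outrank the automatic default route.

    A lower priority value means a higher priority.
    """
    return [
        route
        for route in learned_routes
        if route["destRange"] in DEFAULT_DESTINATIONS
        and route["priority"] < DEFAULT_ROUTE_PRIORITY
    ]


# Hypothetical learned routes.
learned = [
    {"destRange": "0.0.0.0/0", "priority": 100},
    {"destRange": "::/0", "priority": 1000},
    {"destRange": "10.128.0.0/20", "priority": 100},
]

print(learned_defaults_that_win(learned))
```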

The next hop isn't clear

To learn how Google Cloud's route selection algorithm works, see Applicability and order in the VPC Routes documentation.

IPv6 traffic is not being routed

If you are experiencing difficulty connecting to IPv6 hosts, do the following:

  1. Verify that IPv4 routes are being correctly advertised. By checking IPv4 traffic first, you can rule out general network issues. If IPv4 routes are not being advertised, perform the general troubleshooting procedures listed in this document.

  2. Inspect firewall rules to ensure that you are allowing IPv6 traffic between your VPC network and your on-premises network.

  3. Verify that you don't have overlapping IPv6 subnet ranges in your VPC network and your on-premises network. See Check overlapping subnet ranges.

  4. Determine whether you have exceeded any quotas and limits for your learned routes. If you have exceeded your quota for learned routes, IPv6 prefixes are dropped before IPv4 prefixes. See Check quotas and limits.

  5. Verify that all components that require IPv6 configuration have been configured correctly.

    • The VPC subnet is configured to use the IPV4_IPV6 stack type.

    • The VPC subnet has --ipv6-access-type set to INTERNAL.

    • The Compute Engine VMs on the subnet are configured with IPv6 addresses.

    • The HA VPN gateway or the VLAN attachment for Dedicated Interconnect is configured to use the IPV4_IPV6 stack type.

    • The BGP peer is enabled to use IPv6, and correct IPv6 next hop addresses are configured for the BGP session.
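
You can collect the checks in step 5 into a single pass over your resource descriptions. The following Python sketch assumes that you have parsed JSON for the subnet, the HA VPN gateway or VLAN attachment, and the BGP peer. The function name is hypothetical, and the field names stackType, ipv6AccessType, enableIpv6, ipv6NexthopAddress, and peerIpv6NexthopAddress are assumptions about those resources that you should confirm against your own describe output. The sketch doesn't check whether VMs have IPv6 addresses.

```python
def ipv6_config_problems(subnet, gateway_or_attachment, bgp_peer):
    """Return a list of IPv6 configuration problems (empty if none found)."""
    problems = []
    if subnet.get("stackType") != "IPV4_IPV6":
        problems.append("subnet stack type is not IPV4_IPV6")
    if subnet.get("ipv6AccessType") != "INTERNAL":
        problems.append("subnet IPv6 access type is not INTERNAL")
    if gateway_or_attachment.get("stackType") != "IPV4_IPV6":
        problems.append("gateway or attachment stack type is not IPV4_IPV6")
    if bgp_peer.get("enableIpv6") is not True:
        problems.append("BGP peer does not have IPv6 enabled")
    for field in ("ipv6NexthopAddress", "peerIpv6NexthopAddress"):
        if not bgp_peer.get(field):
            problems.append("BGP peer is missing " + field)
    return problems


# Hypothetical resource descriptions.
subnet = {"stackType": "IPV4_IPV6", "ipv6AccessType": "INTERNAL"}
gateway = {"stackType": "IPV4_ONLY"}
peer = {"enableIpv6": False}

for problem in ipv6_config_problems(subnet, gateway, peer):
    print(problem)
```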

What's next