Cloud Router is a fully distributed and managed Google Cloud service. It scales with your network traffic; it's not a physical device that might cause a bottleneck.
When you extend your on-premises network to Google Cloud, use Cloud Router to dynamically exchange routes between your Google Cloud networks and your on-premises network. Cloud Router peers with your on-premises VPN gateway or router. The routers exchange topology information through Border Gateway Protocol (BGP). Topology changes automatically propagate between your VPC network and on-premises network. You don't need to configure static routes.
Cloud Router works with both legacy networks and Virtual Private Cloud (VPC) networks.
Static versus dynamic routing
With static routes, you must create or maintain a routing table. A topology change on either network requires you to manually update static routes. Also, static routes can't automatically reroute traffic if a link fails.
Static routing is suitable for small networks with stable topologies. You also have strict control over the routing tables. Routers don't send advertisements between networks.
With Cloud Router, you can use BGP to exchange routing information between networks. Instead of manually configuring static routes, networks automatically and rapidly discover topology changes through BGP. Changes are seamlessly implemented without disrupting traffic. This method of exchanging routes through BGP is dynamic routing.
Dynamic routing is suitable for any size network. It frees you from maintaining static routes. Also, if a link fails, dynamic routing can automatically reroute traffic if possible. To enable dynamic routing, create a Cloud Router. Then, configure a BGP session between the Cloud Router and your on-premises gateway or router.
Static routing for VPN tunnels
Without Cloud Router, you can configure your VPN only using static routes. The disadvantages of using static routes are:
- A network configuration change on either side of the VPN tunnel requires manually creating and deleting the corresponding static routes. In addition, static route changes are slow to converge.
- When a VPN tunnel is created to work with static routes, the list of IP prefixes on either side of the tunnel must be specified before the tunnel is created. This means that every time the routes must change, the VPN tunnel has to be updated (deleted and recreated) with the new routes, which disrupts existing traffic.
- There is no standard way to configure static routes. Different vendors use different commands.
In the following example, your combined network consists of your Google Cloud network and 29 subnets (one per rack) on the on-premises network on the other side of the VPN tunnel. For this example, assume your business is growing such that you need to add a new subnet of machines every week. This week, you're adding subnet 10.0.30.0/24, as shown in the following diagram:
In this scenario, a VPN based on static routes would require the following changes:
- Static routes need to be added to Google Cloud Platform to reach the new on-premises subnet.
- The VPN tunnel needs to be torn down and re-created to include the new on-premises subnet.
The configuration changes to static routes and VPN tunnels can be avoided by deploying a Cloud Router. Cloud Router peers with your VPN gateway using BGP to exchange topology information. In effect, network topology changes in your Google Cloud Platform network propagate automatically to your own network and vice versa via BGP so that you don't have to configure static routes for your VPN tunnels.
Dynamic routing for VPN tunnels in VPC networks
Use VPC networks to regionally segment the network IP space into prefixes (subnets) and control which prefix a VM instance's internal IP address is allocated from. To avoid statically managing these subnets, including the burden of adding and removing related static routes for your VPN, use Cloud Router to enable dynamic routing for your VPN tunnel.
A Cloud Router belongs to a specific VPC network and region. Cloud Router advertises subnets from its VPC network to the on-premises (peer) gateway via BGP. It advertises the subnets in its local region or all subnets in the network based on the VPC network's dynamic routing mode. Cloud Router also learns the on-premises routes through BGP and enables the network infrastructure to select the best route to reach the associated prefixes.
The following diagram shows two subnets in a VPC network (Test and Prod) and 29 subnets (one per rack) in the on-premises (peer) network. The two networks are connected through a Cloud VPN tunnel. In this scenario, new subnets are being added in both networks:
- A new subnet in the Google Cloud network (Staging)
- A new rack of machines to handle growing traffic (a new subnet in your data center).
To automatically propagate network configuration changes, the VPN tunnel uses Cloud Router to establish a BGP session with the on-premises VPN gateway, which must support BGP. The new subnets are seamlessly advertised between networks. Instances in the new subnets can start sending and receiving traffic immediately.
To set up BGP, an additional IP address has to be assigned to each end of the VPN tunnel. These two IP addresses must be link-local IP addresses, belonging to the IP address range 169.254.0.0/16. These addresses are not part of the IP address space of either network. These two addresses are used exclusively for establishing a BGP session.
Dynamic routing for VPN tunnels in legacy networks
The following example automatically propagates network configuration changes without the need to reconfigure static routes and restart the VPN tunnel, similar to the previous example.
The BGP session informs each router of local changes. To set up BGP, an
additional IP address has to be assigned to each end of the VPN tunnel. These
two IP addresses must be link-local IP
addresses that belong to the IP address range
169.254.0.0/16. These addresses are not
part of the IP address space on either side of the tunnel and are used
exclusively for configuring BGP peers to establish a BGP session.
You must configure two link-local IP addresses (both from the same subnet) and a netmask on both sides of the tunnel. After you configure these changes on both sides of the tunnel, a BGP session is established.
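The address requirements described above can be checked before you configure the peers. The following is a minimal sketch that uses Python's standard `ipaddress` module. The function name, the sample addresses, and the /30 prefix length are assumptions chosen for illustration; they are not values prescribed by Cloud Router.

```python
import ipaddress

# Link-local range that both BGP peering addresses must come from.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def validate_bgp_peer_addresses(local_ip: str, peer_ip: str, prefix_len: int) -> bool:
    """Return True if both addresses are distinct link-local addresses
    that fall in the same subnet for the given prefix length."""
    local = ipaddress.ip_interface(f"{local_ip}/{prefix_len}")
    peer = ipaddress.ip_interface(f"{peer_ip}/{prefix_len}")
    both_link_local = local.ip in LINK_LOCAL and peer.ip in LINK_LOCAL
    same_subnet = local.network == peer.network
    return both_link_local and same_subnet and local.ip != peer.ip


# Hypothetical pair of addresses for the two ends of one tunnel.
print(validate_bgp_peer_addresses("169.254.1.1", "169.254.1.2", 30))
```

A pair such as 10.0.0.1 and 10.0.0.2 fails the check because neither address is link-local, and two link-local addresses from different subnets fail because they can't be paired on one tunnel.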
If a network has VPN tunnels in multiple regions, you must create a Cloud Router in each region where you want dynamic routing. A single Cloud Router can be used for multiple VPN gateways and multiple tunnels in the region to which the router belongs.
Dynamic routing mode
The dynamic routing mode of a VPC network determines which subnets are visible to Cloud Routers. You can set the dynamic routing mode to be global or regional:
- With global dynamic routing, a Cloud Router advertises all subnets in the VPC network to the on-premises router. Cloud Router propagates learned routes from the on-premises router to all regions.
- With regional dynamic routing, a Cloud Router advertises and propagates routes in its local region.
The dynamic routing mode is configured on the VPC network. When you create or edit a VPC network, you can set the dynamic routing mode to global or regional. All Cloud Routers in the VPC network use the network's dynamic routing mode. By default, the mode is regional.
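The effect of the two modes on default advertisements can be sketched as a simple filter. This is an illustrative model only, not Cloud Router's implementation; the `Subnet` type, the function name, and the sample subnets are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subnet:
    name: str
    region: str
    cidr: str


def default_advertised_subnets(subnets: List[Subnet], router_region: str, mode: str = "regional") -> List[Subnet]:
    """Model which subnets a Cloud Router advertises by default.

    "global": every subnet in the VPC network.
    "regional": only subnets in the Cloud Router's own region (the default mode).
    """
    if mode == "global":
        return list(subnets)
    if mode == "regional":
        return [s for s in subnets if s.region == router_region]
    raise ValueError(f"unknown dynamic routing mode: {mode!r}")


# Hypothetical VPC network with one subnet in each of two regions.
vpc_subnets = [
    Subnet("subnet-west", "us-west1", "10.138.0.0/20"),
    Subnet("subnet-central", "us-central1", "10.128.0.0/20"),
]
print([s.name for s in default_advertised_subnets(vpc_subnets, "us-west1", "regional")])
print([s.name for s in default_advertised_subnets(vpc_subnets, "us-west1", "global")])
```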
If you change the dynamic routing mode for a VPC network, consider the implications, such as interrupting existing connections or enabling unintended routes. For example, if you change to regional dynamic routing, VM instances that could connect to VPN tunnels and interconnects in another region might lose their connection. If you change to global dynamic routing, Cloud Router might advertise VM instances from regions that you didn't intend to advertise. To view or configure the dynamic routing mode, see Setting Regional or Global Dynamic Routing.
Regional dynamic routing example
With regional dynamic routing, you might have a Cloud VPN tunnel and VM instances in a single GCP region. The tunnel extends your on-premises network to a VPC network. VM instances in other regions might need to connect to the on-premises network, but they can't reach the tunnel. To get around this constraint, you could create static routes. However, maintaining static routes can be prone to errors and might disrupt traffic.
In the following example, Cloud Router has visibility to resources only in the us-west1 region. VM instances in other regions, such as us-central1, can't reach the Cloud VPN tunnel.
Global dynamic routing examples
With global dynamic routing, Cloud Router has visibility to resources in all regions. For example, if you have VM instances in one region, they can reach a Cloud VPN tunnel in another region automatically without maintaining static routes.
The following example shows a VPC network with global dynamic routing. The Cloud Router in us-west1 advertises subnets in two different regions: us-west1 and us-central1. VM instances in both regions dynamically learn about the on-premises network.
For redundant topologies, dynamic routing (BGP) provides enough information to the VPC and on-premises networks so that when a path fails traffic is rerouted. If a connection in one region has an issue, traffic can fail over to another region.
The following example shows two Cloud VPN tunnels in two different regions. The VM instances (10.128.0.0/20) use tunnel-us-west1, the tunnel in their own region, to reach both subnets in the on-premises network. Similarly, the VM instances in us-central1 use tunnel-us-central1.
The routes are configured so that VM instances prefer their local tunnels (the tunnels in their region). Cloud Router sets different weights for local and remote routes with the same destination. If one tunnel fails, Cloud Router can reroute traffic appropriately.
In the following example, tunnel-us-west1 fails. Traffic to and from the VM instances (10.128.0.0/20) is rerouted through tunnel-us-central1.
Through BGP, Cloud Router advertises the IP addresses that clients in your on-premises network can reach. Your on-premises network sends packets to your VPC network that have a destination IP address matching an advertised IP range. After reaching GCP, your VPC network's firewall rules and routes determine how GCP handles the packets.
You can use Cloud Router's default advertisements or explicitly specify which CIDR ranges to advertise. If you don't specify advertisements, Cloud Router uses the default.
By default, Cloud Router advertises subnets in its region for regional dynamic routing or all subnets in a VPC network for global dynamic routing. New subnets are automatically advertised by Cloud Router. Also, if a subnet has a secondary IP range for configuring alias IP addresses, Cloud Router advertises both the primary and secondary IP addresses.
Each BGP session for a Cloud Router also has a default advertisement. By default, Cloud Router propagates its route advertisements to all of its BGP sessions. If you configure custom route advertisements on a Cloud Router, its BGP sessions inherit those custom advertisements.
To advertise IP addresses outside of a subnet's range, such as reserved external IP addresses, you must specify custom advertisements. Also, when you use custom advertisements, you can selectively advertise subnets or parts of a subnet. That way you can withhold certain subnets from being advertised. If you don't need these capabilities, use the default advertisements.
When you configure custom route advertisements, you explicitly specify the routes that a Cloud Router advertises. In most cases, custom advertisements are useful for supplementing the default subnet advertisement with custom IP addresses. Custom IP addresses are addresses outside of a subnet's IP range, such as reserved external IP addresses. Without custom route advertisements, you would be required to create and maintain static routes for custom IP addresses.
When you configure custom route advertisements, you can choose to advertise all subnets, which emulates the default behavior. You can choose not to advertise all subnets, and instead advertise specific subnets or certain CIDR blocks within a subnet. For example, you might want to prevent Cloud Router from advertising particular subnets. To do that, you advertise just the ones you want to expose. However, when you selectively advertise subnets, new subnets must be manually added to the custom route advertisement. Cloud Router won't automatically advertise new subnets.
You can specify custom route advertisements on a Cloud Router or on a BGP session. Custom route advertisements on the Cloud Router apply to all of its BGP sessions. However, if you specify a custom route advertisement on a BGP session, the Cloud Router's route advertisement is ignored and overridden by the session's advertisement.
For each Cloud Router, you can specify a maximum of 200 CIDR ranges. Also, each BGP session has the same limit of 200.
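The inheritance and override rules for custom advertisements can be summarized in a short sketch. It is an illustrative model under stated assumptions, not the service's actual logic: the function and parameter names are hypothetical, `None` stands for "no custom advertisement configured", and the 200-range limit is the one stated above.

```python
from typing import List, Optional

# Limit stated above for both a Cloud Router and a BGP session.
MAX_CUSTOM_RANGES = 200


def effective_advertisement(
    default_ranges: List[str],
    router_custom: Optional[List[str]] = None,
    session_custom: Optional[List[str]] = None,
) -> List[str]:
    """Return the CIDR ranges that one BGP session advertises.

    Precedence: a custom advertisement on the session overrides the one on
    the Cloud Router, which in turn replaces the default advertisement.
    """
    for custom in (router_custom, session_custom):
        if custom is not None and len(custom) > MAX_CUSTOM_RANGES:
            raise ValueError(f"at most {MAX_CUSTOM_RANGES} CIDR ranges are allowed")
    if session_custom is not None:
        return list(session_custom)
    if router_custom is not None:
        return list(router_custom)
    return list(default_ranges)


# Hypothetical ranges: the session-level setting wins over the router-level one.
print(effective_advertisement(["10.0.1.0/24"], ["10.0.2.0/24"], ["10.0.3.0/24"]))
```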
The following examples show the Cloud Router's default behavior and scenarios where custom route advertisements might be useful. The examples assume an existing connection between the VPC and on-premises networks, such as an IPsec VPN tunnel or Dedicated Interconnect.
Default route advertisement
For regional dynamic routing, Cloud Router advertises subnets in its region. In the following example, Cloud Router advertises subnets in the us-central1 region. It also advertises the secondary IP range of alias-subnet. If you create new subnets in that region, Cloud Router automatically advertises them. Cloud Router doesn't advertise IP addresses that aren't included in a subnet's IP range, such as external IP addresses.
Advertise an external IP address
You might use an external static IP address for a GCP application that serves clients in your on-premises network. When you perform maintenance on the application, you can remap the static IP address to another VM instance to minimize downtime. With Cloud Router's default advertisements, you're required to configure and maintain a static route. Instead, you can use custom advertisements to advertise the external IP address through BGP.
In the following example, the Cloud Router advertises the proxy server's external IP address 184.108.40.206. The external IP address maps to the server's internal IP address 10.20.0.2. Cloud Router doesn't advertise the internal IP address of the proxy server or any VM instances in the subnet. On-premises clients are only aware of the proxy server's external IP address.
Restrict subnet advertisement
You can prevent instances from being advertised so that they're hidden. In the following example, Cloud Router advertises only selected subnets, including subnet-2. Clients in the on-premises network can reach VM instances in those subnets but not instances in the subnet that isn't advertised.
Route advertisement per BGP session
Assume that you have production and test resources in your VPC and on-premises
networks, and you organized those resources in different subnets. Accordingly,
you set up two BGP sessions that advertise different IP address ranges. By using
two different BGP sessions, traffic bound for one subnet doesn't inadvertently
go to another subnet. The following example shows two BGP sessions: one that advertises only the production subnet, and test-bgp, which advertises only the test subnet.
Best path for egress traffic from GCP to your on-premises network
If Cloud Router receives multiple routes for the same destination, GCP uses route metrics and, in some cases, AS path length to determine the best path. The following list describes the algorithm used for egress traffic with one or more Cloud Routers managing dynamic routes for one VPC network.
- If you have multiple BGP sessions on a single Cloud Router, egress is determined by the first condition that is met:
  - All egress traffic is sent to the route with the shortest AS path length.
  - If routes have the same AS path lengths, all egress traffic is sent to the one with the lowest Multi-Exit Discriminator (MED) value (the lowest route metric).
  - If routes have the same AS path lengths and MED values (the same route metrics), egress traffic is balanced across all routes by using Equal-cost multi-path (ECMP).
- If you use multiple Cloud Routers in the same region, GCP only uses route metrics to determine the best path. The AS path length is not used. Egress is determined by the first condition that is met:
  - All egress traffic is sent to the route with the lowest Multi-Exit Discriminator (MED) value (the lowest route metric).
  - If routes have the same MED values (the same route metrics), egress traffic is balanced across all routes by using ECMP.
- Static routes take priority over Cloud Router dynamic routes in cases of conflict. A static route with the same prefix and route metric as a dynamic route always takes priority, so any conflicting dynamic routes are ignored.
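The selection order for dynamic routes described in this list can be sketched as follows. This is a simplified model, not GCP's implementation: the `Route` type and the function name are hypothetical, all routes are assumed to target the same destination, and static routes are not modeled. When the function returns more than one route, egress traffic would be balanced across them with ECMP.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Route:
    next_hop: str
    as_path_len: int
    med: int  # route metric; lower is preferred


def best_paths(routes: List[Route], single_router: bool = True) -> List[Route]:
    """Return the route(s) chosen for egress traffic to one destination.

    single_router=True: multiple BGP sessions on one Cloud Router, so the
    shortest AS path is compared first, then the lowest MED.
    single_router=False: multiple Cloud Routers in the same region, so only
    the MED is compared.
    """
    if not routes:
        return []
    candidates = list(routes)
    if single_router:
        shortest = min(r.as_path_len for r in candidates)
        candidates = [r for r in candidates if r.as_path_len == shortest]
    lowest_med = min(r.med for r in candidates)
    return [r for r in candidates if r.med == lowest_med]


# Hypothetical learned routes for the same prefix.
learned = [
    Route("tunnel-a", as_path_len=2, med=100),
    Route("tunnel-b", as_path_len=1, med=200),
    Route("tunnel-c", as_path_len=1, med=200),
]
print([r.next_hop for r in best_paths(learned, single_router=True)])
print([r.next_hop for r in best_paths(learned, single_router=False)])
```

With a single Cloud Router, the two routes with the shorter AS path tie on MED and share traffic. With multiple Cloud Routers, AS path length is ignored, so the route with the lowest MED is selected even though its AS path is longer.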
Overlapping IP ranges between a VPC subnet and an on-premises route advertisement
In cases where you have a VPC subnet and an on-premises route advertisement with overlapping IP ranges, GCP directs egress traffic depending on their IP ranges.
If the subnet IP range is broader than or the same as the on-premises route advertisement, GCP ignores the on-premises advertisement. All GCP egress traffic destined for the subnet's IP range is directed to the VPC subnet. For example, if you have a subnet with an IP range of 10.2.0.0/16 and an on-premises router that advertises 10.2.1.0/24, GCP ignores the on-premises advertisement and directs 10.2.0.0/16 traffic to the VPC subnet.
If the subnet IP range is narrower than the on-premises route advertisement, GCP egress traffic is directed to the VPC subnet if the destination IP is within the subnet's IP range. All other egress traffic that matches the on-premises advertisement is directed to the on-premises network. For example, if you have a subnet with an IP range of 10.2.0.0/16 and an on-premises router that advertises 10.0.0.0/8, GCP directs traffic destined for 10.0.0.0/8 to the on-premises network unless the destination matches 10.2.0.0/16, which GCP directs to the subnet.
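The two overlap cases can be reproduced with Python's standard `ipaddress` module. This is a minimal sketch of the behavior described above for one subnet and one on-premises advertisement; the function names and the returned labels are hypothetical.

```python
import ipaddress


def onprem_advertisement_is_used(subnet_cidr: str, advertised_cidr: str) -> bool:
    """An on-premises advertisement is ignored when the subnet range is
    broader than or the same as the advertised range."""
    subnet = ipaddress.ip_network(subnet_cidr)
    advertised = ipaddress.ip_network(advertised_cidr)
    return not advertised.subnet_of(subnet)


def egress_target(dest_ip: str, subnet_cidr: str, advertised_cidr: str) -> str:
    """Return where egress traffic for dest_ip is directed."""
    dest = ipaddress.ip_address(dest_ip)
    if dest in ipaddress.ip_network(subnet_cidr):
        # The subnet always wins for destinations inside its own range.
        return "vpc-subnet"
    if onprem_advertisement_is_used(subnet_cidr, advertised_cidr) and dest in ipaddress.ip_network(advertised_cidr):
        return "on-premises"
    return "no-match"


# Subnet 10.2.0.0/16 is broader than the advertised 10.2.1.0/24: advertisement ignored.
print(egress_target("10.2.1.5", "10.2.0.0/16", "10.2.1.0/24"))
# Subnet 10.2.0.0/16 is narrower than the advertised 10.0.0.0/8.
print(egress_target("10.2.5.5", "10.2.0.0/16", "10.0.0.0/8"))
print(egress_target("10.3.0.1", "10.2.0.0/16", "10.0.0.0/8"))
```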
When Cloud Router advertises or propagates routes, it uses route metrics to specify route priorities. When you have multiple paths between your VPC network and on-premises network, route metrics determine a preferred path. This value is equivalent to the Multi-Exit Discriminator (MED) value. A lower route metric (MED) indicates higher priority.
A route metric is composed of a base advertised route priority and a regional cost. The base priority is a user-specified value, whereas the regional cost is a Google generated value that you can't modify. The regional cost represents the cost of communicating between two regions in a VPC network. Cloud Router adds these two values together to generate a route metric.
For regional dynamic routing, because Cloud Router handles only routes in its region, it doesn't add any regional costs to route metrics. Cloud Router uses only the base advertised route priority.
For global dynamic routing, all Cloud Routers advertise and propagate the same routes. However, each Cloud Router might use different route metrics due to regional costs.
- Base advertised route priority
- When calculating route metrics, Cloud Router starts with the base advertised route priority value and then adds any regional costs. This base value is the minimum route metric for advertised routes. When you configure a BGP session on a Cloud Router, you can specify a base advertised route priority, which applies to all routes for that session. By default, the base advertised priority value is 100.
- The base advertised route priority enables you to set priorities for routes. For example, you might have a VPN tunnel and a dedicated interconnect connecting your VPC and on-premises networks. You can set the base advertised route priority so that traffic prefers the dedicated interconnect. If the interconnect is unavailable, traffic traverses the tunnel. For more information, see the example topologies.
- Regional cost
- When a Cloud Router advertises routes from regions other than its region (routes from remote GCP regions) or propagates routes to remote regions, it adds a regional cost.
- The regional cost can range from 201 to 9999, inclusive. The value depends on the distance, latency, and other factors between two regions. Google generates the regional cost value, and you can’t modify it. For more information about regional costs, see the example topologies.
- Regional costs help prioritize paths based on region proximity. For example, imagine that you have two connections between your VPC and on-premises network, such as two VPN tunnels with their own Cloud Routers. One connection is in us-central1 and another in europe-west1. By adding a regional cost to the route metrics, traffic between networks in us-central1 prioritizes the us-central1 tunnel. Similarly, traffic between networks in europe-west1 prioritizes the europe-west1 tunnel. Without regional costs, traffic would be directed equally through both connections, leading to inconsistent network performance.
- For learned routes, Cloud Router adds regional costs when it propagates the learned routes to remote GCP regions. This helps maintain symmetry between ingress and egress traffic between VPC and on-premises networks. Cloud Router adds regional costs to the MED value that's advertised by your on-premises router.
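The route metric calculation described above is a simple sum, sketched below. The function name and its parameters are hypothetical; the default base priority of 100 and the regional cost range of 201 to 9999 come from the text. Real regional cost values are generated by Google and can't be chosen by you, so the value passed here is only a stand-in.

```python
from typing import Optional

DEFAULT_BASE_PRIORITY = 100
MIN_REGIONAL_COST = 201
MAX_REGIONAL_COST = 9999


def route_metric(
    base_priority: int = DEFAULT_BASE_PRIORITY,
    regional_cost: Optional[int] = None,
    mode: str = "global",
) -> int:
    """Return base priority plus regional cost.

    No regional cost is added under regional dynamic routing, or when the
    route stays within the Cloud Router's own region (regional_cost=None).
    """
    if mode == "regional" or regional_cost is None:
        return base_priority
    if not MIN_REGIONAL_COST <= regional_cost <= MAX_REGIONAL_COST:
        raise ValueError("regional cost must be between 201 and 9999, inclusive")
    return base_priority + regional_cost


print(route_metric())                       # local route, default base priority
print(route_metric(regional_cost=300))      # remote route with a stand-in cost
```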
Suggested base priority values
To adjust priorities between routes in a single region, use values less than 201. This guarantees that regional costs won't impact global route priorities. A route from another region (a remote region) can't have a priority lower than 201. If you use higher values, regional costs might impact your route priorities. For example, suppose you have a primary and backup connection. If you set the backup connection's base priority too high, you might unintentionally prefer routes from other regions instead of the backup connection.
To globally deprioritize a route in a VPC network, use values higher than 10,200. This ensures that all other routes lower than 201 have priority regardless of regional costs.
In cases where all routes in a region are equally preferred, you can use the default value of 100.
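The thresholds in these suggestions follow from the regional cost range of 201 to 9999. The sketch below classifies a base priority against those thresholds and computes the worst-case metric for a route tuned within a region. The function name and the category labels are hypothetical.

```python
MIN_REGIONAL_COST = 201
MAX_REGIONAL_COST = 9999
MAX_REGIONAL_TUNING_BASE = 200         # "values less than 201"
MIN_GLOBAL_DEPRIORITIZE_BASE = 10201   # "values higher than 10,200"


def classify_base_priority(base: int) -> str:
    """Classify a base advertised route priority against the suggested ranges."""
    if base <= MAX_REGIONAL_TUNING_BASE:
        return "regional-tuning"
    if base >= MIN_GLOBAL_DEPRIORITIZE_BASE:
        return "globally-deprioritized"
    return "may-interact-with-regional-costs"


# Highest metric a regionally tuned route can reach after a regional cost is added.
worst_tuned_metric = MAX_REGIONAL_TUNING_BASE + MAX_REGIONAL_COST

print(classify_base_priority(100))
print(worst_tuned_metric, worst_tuned_metric < MIN_GLOBAL_DEPRIORITIZE_BASE)
```

Because the worst-case tuned metric stays below the smallest globally deprioritized base priority, a route with a base priority above 10,200 loses to every route with a base priority below 201, whatever the regional costs are.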
The following examples explain how regional costs influence route metrics when you use global dynamic routing.
Suppose you have a VPC network with two VPN tunnels with their own Cloud Routers. One tunnel is in us-central1 and the other in us-west1. By default, ingress traffic to those regions uses the corresponding tunnels in their respective regions. However, what happens if you want to reach VM instances that aren't in those regions, such as instances in europe-west1? The following diagram shows how the regional costs affect route metrics.
Both Cloud Routers advertise routes to europe-west1 but with different route metrics. The Cloud Router in us-central1 advertises routes to europe-west1 with a lower route metric than the Cloud Router in us-west1 due to distance, latency, and other factors. The example assumes the regional cost to europe-west1 is 300 through us-central1 and 350 through us-west1. Ingress traffic uses the us-central1 tunnel, which has a lower route metric: 350 versus 400 for the us-west1 tunnel.
Similarly, Cloud Router adds a regional cost to the MED value of learned routes (specified by your on-premises router). By default, egress traffic from the europe-west1 region also uses the us-central1 tunnel because it has a lower route metric. This way ingress and egress traffic maintain symmetry.
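The numbers in this example can be reproduced with a few lines of arithmetic. The base priority of 50 is an inference: the text gives regional costs of 300 and 350 and resulting metrics of 350 and 400, which implies a base of 50 on both tunnels. The variable names are hypothetical.

```python
# Regional costs to europe-west1, as assumed in the example above.
regional_cost_to_europe_west1 = {"us-central1": 300, "us-west1": 350}

# Inferred from the stated metrics (350 - 300 and 400 - 350); not stated directly.
BASE_PRIORITY = 50

metrics = {
    region: BASE_PRIORITY + cost
    for region, cost in regional_cost_to_europe_west1.items()
}

# The tunnel with the lowest route metric is preferred.
preferred_tunnel_region = min(metrics, key=metrics.get)

print(metrics)
print(preferred_tunnel_region)
```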
Route priorities within a region
For redundancy in us-west1, suppose you create a backup tunnel. You specify a higher base priority for the backup tunnel so that ingress traffic to us-west1 prefers the primary tunnel, as shown in the following example:
If the primary tunnel fails, ingress traffic to us-west1 prefers the backup tunnel with a route metric of 51 over the us-central1 tunnel, which has a route metric of 400:
Similarly, for egress traffic from the VPC network to your on-premises network, use MED values lower than 201 to prioritize one path over another. Otherwise, egress traffic from the VPC network to your on-premises network might not be symmetric with ingress traffic.
In cases where all tunnels or interconnects in a region are equally preferred, you can use the default base route priority of 100.
Globally preferred route
Suppose you have a dedicated interconnect and a VPN tunnel in different regions. You want to prioritize the dedicated interconnect because it’s more cost effective for your workloads than the VPN tunnel. Specify a base priority of 10,051 for the VPN tunnel routes to deprioritize it. As a result, all ingress traffic uses the dedicated interconnect, independent of regional costs. Route metrics for the dedicated interconnect won't exceed 10,051. Traffic uses the VPN tunnel only if the interconnect fails.
You’ll also need to make the same adjustments to your on-premises router so that egress traffic from the VPC network to the on-premises network will always prefer to use the dedicated interconnect.
When no route is specified for a particular IP destination, traffic is sent to a default route, which acts like a last resort when no other options exist. For example, GCP VPC networks automatically include a default route (0.0.0.0/0) that sends traffic to the Internet gateway.
In some cases, you might want traffic to be directed to your on-premises network
by default. To do that, you can advertise a default route from your on-premises
router to Cloud Router. With Cloud Router, you don't need to
create and manage static routes. If you advertise a default route from your
on-premises network, check that it's prioritized over other automatically
created default routes (has a lower MED value).
To check, go to the Routes page and view the Priority for routes with 0.0.0.0/0 as the Destination IP range and Default internet gateway as the Next hop.
Cloud Router has graceful restart enabled. Graceful restart allows the on-premises BGP device to go offline and then recover within the graceful restart period without disrupting traffic flow. For Cloud Router, the graceful restart period is two minutes. This feature protects against disruption when BGP agents need software upgrades and other types of maintenance, or fail temporarily. Enable graceful restart on your BGP device if it supports this feature.
Cloud VPN tunnel with graceful restart
In the following example, if Cloud Router requires a maintenance update, it can be updated without causing any disruption to traffic as long as it comes back online within the graceful restart period.
Redundant Cloud VPN tunnels
If the peer (on-premises) gateway doesn't support graceful restart, then a failure on either side of the BGP session causes the session to fail and traffic is disrupted. After the BGP timeout is exceeded, which is 60 seconds for Cloud Router, routes are withdrawn from both sides. Dynamically routed VPN traffic will no longer enter the tunnel. Static routes for the tunnel will continue to be serviced.
Without graceful restart support, deploying two peer gateways, with one tunnel each, provides redundancy and failover. This configuration enables one tunnel and its devices to go offline for software upgrades or maintenance without disrupting traffic. Also, if one tunnel fails, the other tunnel can keep the routes active and traffic flowing.
The following example shows a single box as the Cloud Router, but with two IP addresses. The two addresses are separate Ethernet interfaces inside the same Cloud Router task. Each interface is used for a separate BGP session with a separate peer. In this particular use case, because these VPN tunnels are created for redundancy, both BGP sessions exchange exactly the same set of route prefixes but with different next hops that point to different VPN tunnels.
Cloud Router gets upgraded periodically, which takes under 60 seconds. Cloud Router is not available during the upgrade. The BGP hold timer determines how long learned routes are preserved when the peered BGP router is unavailable. The BGP hold timer is negotiated to be the lower of the two values from both sides. Cloud Router uses a value of 60 seconds for the BGP hold timer. We recommend that you set your peer BGP hold timer to 60 seconds or greater (the default value is 3 minutes). Then both routers preserve their routes during these upgrades and traffic continues to flow.
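The hold timer negotiation described above amounts to taking the lower of the two configured values. The following sketch shows why a peer hold timer below 60 seconds can cause routes to be dropped during a Cloud Router upgrade. The function names and the 45-second outage are hypothetical; the 60-second Cloud Router value comes from the text.

```python
CLOUD_ROUTER_HOLD_SECONDS = 60


def negotiated_hold_time(peer_hold_seconds: int, cloud_router_hold_seconds: int = CLOUD_ROUTER_HOLD_SECONDS) -> int:
    """The negotiated BGP hold timer is the lower of the two sides' values."""
    return min(peer_hold_seconds, cloud_router_hold_seconds)


def routes_preserved(outage_seconds: int, peer_hold_seconds: int) -> bool:
    """Learned routes are kept if the outage ends before the hold timer expires."""
    return outage_seconds < negotiated_hold_time(peer_hold_seconds)


# Peer left at a 3-minute hold timer: the negotiated value is 60 seconds.
print(negotiated_hold_time(180), routes_preserved(45, 180))
# Peer set to 30 seconds: the negotiated value drops to 30 seconds.
print(negotiated_hold_time(30), routes_preserved(45, 30))
```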
During VPN gateway maintenance cycles with a single VPN gateway, the use of Cloud Router adds about 20 seconds to the tunnel recovery time because the BGP session is reset and routes have to be relearned. VPN gateway recovery times are usually about a minute. If there are redundant VPN gateways, traffic is unaffected because only one VPN gateway is taken down at a time.
For more information about using static and dynamic routing with a supported service, view the following documentation:
|Service|Routing|Documentation|
|---|---|---|
|Dedicated Interconnect|Static|Not supported|
|Dedicated Interconnect|Dynamic|Creating VLAN Attachments|
|Cloud VPN|Static|Creating a VPN Tunnel with Static Routes|
|Cloud VPN|Dynamic|Creating a VPN Tunnel with Dynamic Routes|