Networking for secure intra-cloud access: Reference architectures

Last reviewed 2023-11-13 UTC

This document is part of a series that describes networking and security architectures for enterprises that are migrating data center workloads to Google Cloud.

The series consists of the following documents:

Workloads for intra-cloud use cases reside in VPC networks and need to connect to other resources in Google Cloud. They might consume services that are provided natively in the cloud, like BigQuery. The security perimeter is provided by a variety of first-party (1P) and third-party (3P) capabilities like firewalls, VPC Service Controls, and network virtual appliances.

In many cases, these workloads span multiple Google Cloud VPC networks, and the boundaries between the VPC networks need to be secured. This document covers these security and connectivity architectures in depth.

Lift-and-shift architecture

The first scenario for an intra-cloud use case is a lift-and-shift architecture where you're moving established workloads to the cloud as is.

Firewall

You can help establish a secure perimeter by configuring firewall rules. You can use network tags to apply fine-grained firewall rules to a collection of VMs. A network tag is an arbitrary character string that you add to the tags field of a VM, either when you create the VM or later by editing it. For implementation guidelines on how to manage traffic with Google Cloud firewall rules, see Network firewall policies in the enterprise foundations blueprint.

You can also use firewall logging to audit and verify the effects of the firewall rule setting.

You can use VPC Flow Logs for network forensics and stream the logs to a security information and event management (SIEM) system. The combined system can provide real-time monitoring, correlation of events, analysis, and security alerts.

Figure 1 shows how firewall rules can use VM tags to help restrict traffic among VMs in a VPC network.

Figure 1. Network firewall configuration that uses network tags to apply fine-grained egress control.
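The following Python sketch models how tag-scoped egress rules of the kind shown in figure 1 might be evaluated. It is an illustration only, not the Google Cloud firewall implementation: the `EgressRule` class, the rule names, the tags, and the ports are all hypothetical. The sketch assumes that rules are evaluated in ascending priority order, that a deny rule wins over an allow rule of equal priority, and that egress traffic with no matching rule is permitted by an implied allow-egress rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EgressRule:
    name: str
    priority: int            # lower number is evaluated first
    action: str              # "allow" or "deny"
    target_tags: frozenset   # applies to VMs with any of these tags; empty = all VMs
    dest_ports: frozenset    # TCP ports the rule matches; empty = all ports


def evaluate_egress(vm_tags, port, rules):
    """Return (action, rule_name) for the first rule that applies to the VM and port."""
    # Sort by priority; for equal priorities, deny rules sort ahead of allow rules.
    ordered = sorted(rules, key=lambda r: (r.priority, r.action != "deny"))
    for rule in ordered:
        applies_to_vm = not rule.target_tags or bool(rule.target_tags & set(vm_tags))
        matches_port = not rule.dest_ports or port in rule.dest_ports
        if applies_to_vm and matches_port:
            return rule.action, rule.name
    # Assumption: no explicit match falls through to an implied allow for egress.
    return "allow", "implied-allow-egress"


# Hypothetical rule set: only VMs tagged "app" may reach the database port.
RULES = [
    EgressRule("allow-app-to-db", 1000, "allow", frozenset({"app"}), frozenset({5432})),
    EgressRule("deny-all-egress", 65000, "deny", frozenset(), frozenset()),
]

print(evaluate_egress({"app"}, 5432, RULES))
print(evaluate_egress({"web"}, 5432, RULES))
```

In this model, a VM tagged `app` is allowed to send traffic to port 5432, while every other combination of tag and port falls through to the low-priority deny rule.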

Network virtual appliance

A network virtual appliance (NVA) is a VM that has multiple network interfaces, which lets it connect directly to several VPC networks. Security functions such as web application firewalls (WAF) and application-level firewalls can be implemented on the VMs. You can use NVAs to implement security functions for east-west traffic, especially when you're using a hub-and-spoke configuration, as shown in figure 2.

For implementation guidelines about how to use NVAs on Google Cloud, see Centralized network appliances on Google Cloud.

Figure 2. Centralized network appliance configuration in a Shared VPC network.

Cloud IDS

Cloud Intrusion Detection System (Cloud IDS) lets you implement native security inspection and logging by mirroring traffic from a subnet in your VPC network. By using Cloud IDS, you can inspect and monitor a wide variety of threats at the network layer and at the application layer for analysis. You create Cloud IDS endpoints in your VPC network that's associated with your Google Cloud project. These endpoints monitor ingress and egress traffic to and from that network, as well as intra-VPC network traffic, by using the packet mirroring functionality that's built into the Google Cloud networking stack. You must enable private services access in order to connect to the service producer project (the Google-managed project) that hosts the Cloud IDS processes.

If you have a hub-and-spoke architecture, traffic from each of the spokes can be mirrored to the Cloud IDS instances, as shown in figure 3.

Figure 3. Cloud IDS configuration to mirror VPC traffic that uses private services access.

Cloud IDS can be secured in your VPC Service Controls service perimeter using an additional step. You can read more about VPC Service Controls support in supported products.

VPC Network Peering

For applications that span multiple VPC networks, whether they belong to the same Google Cloud project or to the same organization resource, VPC Network Peering enables connectivity between VPC networks. This connectivity lets traffic stay within Google's network so that it does not traverse the public internet.

There are two models for using VPC Network Peering in a lift-and-shift architecture. One is a "pure" hub-and-spoke architecture, and the other is a hub-and-spoke architecture with transitivity, where traffic from one spoke can reach another spoke. The following sections provide details about how to use VPC Network Peering with these different architectures.

Hub-and-spoke architecture

A hub-and-spoke architecture is a popular model for VPC connectivity that uses VPC Network Peering. This model is useful when an enterprise has various applications that need to access a common set of services, such as logging or authentication. The model is also useful if the enterprise needs to implement a common set of security policies for traffic that's exiting the network through the hub. In a pure hub-and-spoke model, the traffic exchange between the spokes (known as transitive traffic) is disallowed. Figure 4 shows a pure hub-and-spoke architecture that uses VPC Network Peering to connect the spokes to the hub. For implementation guidelines for building hub-and-spoke networks, see Hub-and-spoke network topology in the enterprise foundations blueprint.

However, if you don't need VPC-level separation, you can use a Shared VPC architecture, which might provide a simpler model for some enterprises that are just starting on Google Cloud.

Figure 4. Hub-and-spoke network architecture that uses VPC Network Peering for network isolation and non-transitive connectivity.
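The non-transitive behavior of the pure hub-and-spoke model can be sketched as a small reachability check. This is a simplified model, not a Google Cloud API: the network names and the `build_peerings` and `can_reach` helpers are hypothetical, and the sketch assumes that traffic flows only between networks that are directly peered.

```python
def build_peerings(pairs):
    """Build a symmetric adjacency map from (network, network) peering pairs."""
    peers = {}
    for a, b in pairs:
        peers.setdefault(a, set()).add(b)
        peers.setdefault(b, set()).add(a)
    return peers


def can_reach(peers, src, dst):
    """Only directly peered networks exchange traffic; peering is not transitive."""
    return src == dst or dst in peers.get(src, set())


# Hypothetical topology: two spokes, each peered only with the hub.
PEERINGS = build_peerings([("hub", "spoke-a"), ("hub", "spoke-b")])

print(can_reach(PEERINGS, "spoke-a", "hub"))
print(can_reach(PEERINGS, "spoke-a", "spoke-b"))
```

Each spoke can reach the hub, but `spoke-a` cannot reach `spoke-b` because no direct peering exists between them, even though both are peered with the hub.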

Hub and spoke with transitivity

To enable hub-and-spoke with transitivity (traffic from a spoke can reach other spokes through the hub), there are several approaches that use VPC Network Peering. You can use VPC Network Peering in a full mesh topology, where every VPC network directly peers with every other VPC network that it needs to reach.

Alternatively, you can use an NVA to connect the hub and the spokes together. The NVA then resides behind internal load balancers that are used as the next-hop for traffic from the VPC spokes. Figure 5 shows both of these options.

Additionally, you can use VPNs to connect the hub and the spoke VPC networks. This arrangement enables spoke-to-spoke reachability, which provides transitivity across the hub VPC network.

Figure 5. Hub-and-spoke network configuration that uses Cloud VPN for network isolation and transitive connectivity.
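When you weigh the full mesh option described in this section against a hub-based design, it can help to count the peerings that each one needs. The following sketch does that arithmetic. The function names and network names are hypothetical, and the sketch assumes that every network in the mesh needs to reach every other network.

```python
from itertools import combinations


def full_mesh_peerings(networks):
    """Every pair of networks peers directly: n * (n - 1) / 2 peerings."""
    return list(combinations(sorted(networks), 2))


def hub_and_spoke_peerings(hub, spokes):
    """Each spoke peers only with the hub: one peering per spoke."""
    return [(hub, spoke) for spoke in sorted(spokes)]


spokes = ["spoke-a", "spoke-b", "spoke-c"]
print(len(full_mesh_peerings(["hub"] + spokes)))
print(len(hub_and_spoke_peerings("hub", spokes)))
```

The number of peerings in a full mesh grows quadratically with the number of networks, while a hub-based design grows linearly. That difference is one reason to consider an NVA or VPN in the hub when many spokes need transitive connectivity.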

Shared VPC

You can use Shared VPC to maintain centralized control over network resources like subnets, routes, and firewalls in host projects. This level of control lets you implement the security best practice of least privilege for network administration, auditing, and access control because you can delegate network administration tasks to network and security administrators. You can assign the ability to create and manage VMs to instance administrators by using service projects. Using a service project ensures that the VM administrators are only given the ability to create and manage instances, and that they are not allowed to make any network-impacting changes in the Shared VPC network.

For example, you can provide more isolation by defining two VPC networks in two host projects, one for production and one for testing, and by attaching multiple service projects to each network. Figure 6 shows an architecture that isolates a production environment from a testing environment by using separate projects.

For more information about best practices for building VPC networks, see Best practices and reference architectures for VPC design.

Figure 6. Shared VPC network configuration that uses multiple isolated hosts and service projects (test and production environments).

Hybrid services architecture

The hybrid services architecture provides additional cloud-native services that are designed to let you connect and secure services in a multi-VPC environment. These cloud-native services supplement what is available in the lift-and-shift architecture and can make it easier to manage a VPC-segmented environment at scale.

Private Service Connect

Private Service Connect lets a service that's hosted in one VPC network be surfaced in another VPC network. There is no requirement that the services be hosted by the same organization resource, so Private Service Connect can be used to privately consume services from another VPC network, even if it's attached to another organization resource.

You can use Private Service Connect in two ways: to access Google APIs or to access services hosted in other VPC networks.

Use Private Service Connect to access Google APIs

When you use Private Service Connect, you can expose Google APIs by using a Private Service Connect endpoint that's a part of your VPC network, as shown in figure 7.

Figure 7. Private Service Connect configuration to send traffic to Google APIs by using a Private Service Connect endpoint that's private to your VPC network.

Workloads can send traffic to a bundle of global Google APIs by using a Private Service Connect endpoint. In addition, you can use a Private Service Connect backend to access a single Google API, extending the security features of load balancers to API services. Figure 8 shows this configuration.

Figure 8. Private Service Connect configuration to send traffic to Google APIs by using a Private Service Connect backend.

Use Private Service Connect between VPC networks or entities

Private Service Connect also lets a service producer offer services to a service consumer in another VPC network either in the same organization resource or in a different one. A service producer VPC network can support multiple service consumers. The consumer can connect to the producer service by sending traffic to a Private Service Connect endpoint located in the consumer's VPC network. The endpoint forwards the traffic to the VPC network containing the published service.

Figure 9. Private Service Connect configuration to publish a managed service through a service attachment and consume the service through an endpoint.

Serverless VPC Access connector

A Serverless VPC Access connector handles traffic between your serverless environment and your VPC network. When you create a connector in your Google Cloud project, you attach it to a specific VPC network and region. You can then configure your serverless services to use the connector for outbound network traffic. You can specify a connector by using a subnet or a CIDR range. Traffic sent through the connector into the VPC network originates from the subnet or the CIDR range that you specified, as shown in figure 10.

Figure 10. Serverless VPC Access connector configuration to access Google Cloud serverless environments by using internal IP addresses inside your VPC network.
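Because traffic from the connector originates from the range that you specify, it can be useful to check a candidate range before you create the connector. The following sketch uses Python's standard `ipaddress` module to do that. The helper names and all of the address ranges are hypothetical. The /28 size check reflects an assumption about connector range requirements that you should confirm against the current Serverless VPC Access documentation.

```python
import ipaddress


def validate_connector_range(cidr, existing_subnets):
    """Return the parsed range if it is a /28 that overlaps no existing subnet."""
    candidate = ipaddress.ip_network(cidr)
    if candidate.prefixlen != 28:  # assumption: connectors use a /28 range
        raise ValueError(f"{cidr} is not a /28 range")
    for subnet in existing_subnets:
        if candidate.overlaps(ipaddress.ip_network(subnet)):
            raise ValueError(f"{cidr} overlaps existing subnet {subnet}")
    return candidate


def is_from_connector(source_ip, connector_range):
    """True when a packet's source address falls inside the connector range."""
    return ipaddress.ip_address(source_ip) in connector_range


# Hypothetical values: one existing subnet and one candidate connector range.
connector = validate_connector_range("10.8.0.0/28", ["10.0.0.0/24"])
print(connector.num_addresses)
print(is_from_connector("10.8.0.5", connector))
```

A check like `is_from_connector` mirrors what a firewall rule on the destination side would do: allow traffic whose source address belongs to the connector range and treat other sources separately.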

Serverless VPC Access connectors are supported in every region that supports Cloud Run, Cloud Functions, or the App Engine standard environment. For more information, see the list of supported services and supported networking protocols for Serverless VPC Access connectors.

VPC Service Controls

VPC Service Controls helps you prevent data exfiltration from services such as Cloud Storage or BigQuery by blocking access from the internet or from projects that are not part of a security perimeter, even when IAM would authorize that access. For example, consider a scenario where human error or incorrect automation causes IAM policies to be set incorrectly on a service such as Cloud Storage or BigQuery. As a result, resources in these services become publicly accessible, and there is a risk of data exposure. If you have these services configured as part of the VPC Service Controls perimeter, access to the resources from outside the perimeter is blocked, even if IAM policies allow access.

VPC Service Controls can create perimeters based on client attributes such as identity type (service account or user) and network origin (IP address or VPC network).

VPC Service Controls helps mitigate the following security risks:

  • Access from unauthorized networks that use stolen credentials.
  • Data exfiltration by malicious insiders or compromised code.
  • Public exposure of private data caused by misconfigured IAM policies.

Figure 11 shows how VPC Service Controls lets you establish a service perimeter to help mitigate these risks.

Figure 11. VPC service perimeter extended to hybrid environments by using private access services.

By using ingress and egress rules, you can enable communication between two service perimeters, as shown in figure 12.

Figure 12. Configuring ingress and egress rules to communicate between service perimeters.
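The interaction between IAM and a service perimeter can be sketched as two independent checks that must both pass. The following Python model is a deliberate simplification and not the VPC Service Controls API: the `Perimeter` class, the project names, and the single `ingress_allowed_projects` field are hypothetical. Real ingress and egress rules can also match on identities, access levels, and individual service methods, and this sketch models only the ingress direction.

```python
from dataclasses import dataclass, field


@dataclass
class Perimeter:
    name: str
    projects: set                 # projects inside the perimeter
    restricted_services: set      # services the perimeter protects
    ingress_allowed_projects: set = field(default_factory=set)  # simplified ingress rule


def is_access_allowed(perimeter, *, iam_allows, service, resource_project, caller_project):
    """Access needs IAM approval and, for protected resources, a permitted origin."""
    if not iam_allows:
        return False
    protected = (
        resource_project in perimeter.projects
        and service in perimeter.restricted_services
    )
    if not protected:
        return True  # the perimeter does not apply to this resource or service
    if caller_project in perimeter.projects:
        return True  # caller is inside the perimeter
    return caller_project in perimeter.ingress_allowed_projects


# Hypothetical perimeter; caller_project=None stands for a request from the internet.
PROD = Perimeter(
    name="prod",
    projects={"prod-data"},
    restricted_services={"storage.googleapis.com", "bigquery.googleapis.com"},
    ingress_allowed_projects={"analytics-partner"},
)

print(is_access_allowed(PROD, iam_allows=True, service="storage.googleapis.com",
                        resource_project="prod-data", caller_project=None))
```

The example call models the misconfiguration scenario described earlier in this section: IAM allows the request, but the caller is outside the perimeter and matches no ingress rule, so the request is rejected.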

For detailed recommendations for VPC Service Controls deployment architectures, see Design and architect service perimeters. For more information about the list of services that are supported by VPC Service Controls, see Supported products and limitations.

Zero Trust Distributed Architecture

Network perimeter security controls are necessary but not sufficient to support the security principles of least privilege and defense in depth. Zero Trust Distributed Architectures build on, but do not solely rely on, the network perimeter edge for security enforcement. As distributed architectures, they are composed of microservices with per-service enforcement of security policy, strong authentication, and workload identity.

You can implement Zero Trust Distributed Architectures as services managed by Traffic Director and Anthos Service Mesh.

Traffic Director

Traffic Director can be configured to provide a Zero Trust Distributed Architecture microservice mesh inside a GKE cluster by using service security. In this model, for GKE services that use either Envoy sidecars or proxyless gRPC, identity, certificates, and authorization policy are managed jointly by Traffic Director, workload identity, Certificate Authority Service, and IAM. Certificate management and secure naming are provided by the platform, and all service communication is subject to mTLS transport security. Figure 13 shows a cluster with this configuration.

Figure 13. Single-cluster Zero Trust Distributed Architecture mesh that uses Traffic Director.

An authorization policy specifies how a server authorizes incoming requests or RPCs. The authorization policy can be configured to allow or deny an incoming request or RPC based on various parameters, such as the identity of the client that sent the request, the host, the headers, and other HTTP attributes. Implementation guidelines are available for configuring authorization policies on meshes based on gRPC and Envoy.
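The kind of decision that an authorization policy makes can be sketched in a few lines of Python. This is a simplified deny-overrides model with a default of rejecting unmatched requests. It is not the policy schema or the evaluation order that Traffic Director, Envoy, or gRPC actually use, and the `Rule` class, the SPIFFE-style identity, the host name, and the header are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    action: str                          # "ALLOW" or "DENY"
    principals: frozenset = frozenset()  # empty means any client identity
    hosts: frozenset = frozenset()       # empty means any host
    headers: tuple = ()                  # required (name, value) pairs


def rule_matches(rule, request):
    """A rule matches when every attribute it constrains agrees with the request."""
    if rule.principals and request["principal"] not in rule.principals:
        return False
    if rule.hosts and request["host"] not in rule.hosts:
        return False
    return all(request["headers"].get(name) == value for name, value in rule.headers)


def authorize(rules, request):
    """Any matching DENY rejects; otherwise at least one ALLOW must match."""
    if any(r.action == "DENY" and rule_matches(r, request) for r in rules):
        return False
    return any(r.action == "ALLOW" and rule_matches(r, request) for r in rules)


# Hypothetical policy: the frontend may call the cart service, but never with x-debug set.
FRONTEND = "spiffe://example.test/ns/shop/sa/frontend"
RULES = [
    Rule("ALLOW", principals=frozenset({FRONTEND}), hosts=frozenset({"cart.shop.internal"})),
    Rule("DENY", headers=(("x-debug", "true"),)),
]

request = {"principal": FRONTEND, "host": "cart.shop.internal", "headers": {}}
print(authorize(RULES, request))
```

Under mTLS, the client identity that the rule compares against would come from the verified peer certificate rather than from anything the client asserts in the request itself.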

In figure 13, the architecture has a single cluster and flat networking (shared IP address space). Multiple clusters are typically used in Zero Trust Distributed Architecture for isolation, location, and scale.

In more complex environments, multiple clusters can share managed identity when the clusters are grouped by fleets. In that case, you can configure networking connectivity across independent VPC networks by using Private Service Connect. This approach is similar to the hybrid workload access multi-cluster network connectivity approach, as described later in this document.

For information about fine-grained control of how traffic is handled with Traffic Director, see Advanced traffic management overview.

Anthos Service Mesh

Anthos Service Mesh provides an out-of-the-box mTLS Zero Trust Distributed Architecture microservice mesh that's built on Istio foundations. You set up the mesh by using an integrated flow. Managed Anthos Service Mesh, with Google-managed data and control planes, is supported on GKE. An in-cluster control plane is also available, which is suitable for other environments such as Google Distributed Cloud Virtual or GKE Multi-Cloud. Anthos Service Mesh manages identity and certificates for you, providing an Istio-based authorization policy model.

Anthos Service Mesh relies on fleets for managing multi-cluster service deployment configuration and identity. As with Traffic Director, when your workloads operate in a flat (or shared) VPC network connectivity environment, there are no special network connectivity requirements beyond firewall configuration. When your architecture includes multiple Anthos Service Mesh clusters across separate VPC networks or networking environments, such as across a Cloud Interconnect connection, you also need an east-west gateway. Best practices for networking for Anthos Service Mesh are the same as those that are described in Best practices for GKE networking.

Anthos Service Mesh also integrates with Identity-Aware Proxy (IAP). IAP lets you set fine-grained access policies so that you can control user access to a workload based on attributes of the originating request, such as user identity, IP address, and device type. This level of control enables an end-to-end zero-trust environment.

You need to consider GKE cluster requirements when you use Anthos Service Mesh. For more information, see the Requirements section in the "Single project installation on GKE" documentation.

What's next