Running Windows Server Failover Clustering

You can create a failover cluster using Windows Server on Google Cloud Platform (GCP). In a failover cluster, a group of servers works together to provide high availability (HA) for your Windows applications. If one cluster node fails, another node can take over running the software. You can configure the failover to happen automatically, which is the usual configuration, or you can manually trigger a failover.

This tutorial assumes you are familiar with failover clustering, Active Directory (AD), and administration of Windows Server.

For a brief overview of networking in GCP, see GCP for Data Center Pros: Networking.

Architecture

This tutorial walks you through how to create an example failover cluster on Compute Engine. The example system contains the following three servers:

  • A primary Compute Engine VM instance running Windows Server 2016.
  • A second instance, configured to match the primary instance.
  • An AD domain name server (DNS). This server:

    • Provides a Windows domain.
    • Resolves hostnames to IP addresses.
    • Hosts the file share witness that acts as a third "vote" to achieve the required quorum for the cluster.

You create the AD DNS only to enable this example. In a production system, you can host the file share witness elsewhere, and you don't need a separate AD system only to support your failover cluster. See What's next for links to articles about using AD on GCP.

The following diagram describes the architecture you deploy by following this tutorial.

Architecture diagram showing two Compute Engine VMs in a failover cluster

Understanding the network routing

When the cluster fails over, requests must go to the newly active node. The clustering technology normally handles routing by using Address Resolution Protocol (ARP), which associates IP addresses with MAC addresses. In GCP, the Virtual Private Cloud (VPC) system uses software-defined networking, which doesn't expose MAC addresses, so the changes that ARP broadcasts don't affect routing at all. To make routing work, the cluster requires some software-level help from an internal load balancer.

Usually, internal load balancing distributes incoming network traffic among multiple backend instances that are internal to your VPC, to share the load. For failover clustering, you instead use internal load balancing to route all traffic to just one instance: the currently active cluster node. Here's how internal load balancing detects the correct node:

  • Each VM instance runs a Compute Engine agent that provides support for Windows failover clustering. The agent keeps track of the IP addresses assigned to the instance.
  • The load balancer's frontend provides the IP address for incoming traffic to the application.
  • The load balancer's backend provides a health check. The health check process periodically pings the agent on each cluster node by using the fixed IP address of the VM instance through a particular port. The default port is 59998.
  • The health check includes the application's IP address as a payload in the request.
  • The agent compares the IP address in the request to the list of IP addresses for the host VM. If the agent finds a match, it responds with a value of 1. Otherwise, it responds with 0.
  • The load balancer marks any VM that passes the health check as healthy. At any moment, only one VM ever passes the health check because only one VM has the IP address for the workload.
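
To make the exchange concrete, the following hypothetical PowerShell probe mimics the health check from another VM in the subnetwork: it opens a TCP connection to a node's agent port and sends the application's IP address as the payload. The port and the 1/0 response come from the description above, but the exact wire framing of the agent protocol is an assumption, so treat this as an illustration rather than a supported interface.

# Hypothetical probe of the cluster host agent on wsfc-1 (illustration only).
# Assumes the agent reads a plain ASCII IP address and answers "1" (active)
# or "0" (standby), as described in the list above.
$client = New-Object System.Net.Sockets.TcpClient("10.0.0.4", 59998)
$stream = $client.GetStream()
$payload = [System.Text.Encoding]::ASCII.GetBytes("10.0.0.9")
$stream.Write($payload, 0, $payload.Length)
$buffer = New-Object byte[] 8
$read = $stream.Read($buffer, 0, $buffer.Length)
[System.Text.Encoding]::ASCII.GetString($buffer, 0, $read)   # expect "1" or "0"
$client.Close()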

What happens during a failover

When a failover happens in the cluster, the following changes take place:

  • Windows failover clustering changes the status of the active node to indicate that it has failed.
  • Failover clustering moves any cluster resources and roles from the failing node to the best node, as defined by the quorum. This action includes moving the associated IP addresses.
  • Failover clustering broadcasts ARP packets to notify hardware-based network routers that the IP addresses have moved. For this scenario, GCP networking ignores these packets.
  • After the move, the Compute Engine agent on the VM for the failing node changes its response to the health check from 1 to 0, because the VM no longer hosts the IP address specified in the request.
  • The Compute Engine agent on the VM for the newly active node likewise changes its response to the health check from 0 to 1.
  • The internal load balancer stops routing traffic to the failing node and instead routes traffic to the newly active node.

Putting it together

Now that you've reviewed some of the concepts, here are some details to notice about the architecture diagram:

  • The Compute Engine agent for the VM named wsfc-2 is responding to the health check with the value 1, indicating it is the active cluster node. For wsfc-1, the response is 0.
  • The load balancer is routing requests to wsfc-2, as indicated by the arrow.
  • The load balancer and wsfc-2 both have the IP address 10.0.0.9. For the load balancer, this is the specified frontend IP address. For the VM, it's the IP address of the application. The failover cluster sets this IP address on the currently active node.
  • The failover cluster and wsfc-2 both have the IP address 10.0.0.8. The VM has this IP address because it currently hosts the cluster resources.

Advice for following this tutorial

This tutorial has a lot of steps. Sometimes, you are asked to follow steps in external documents, such as Microsoft documentation. Don't miss the notes in this document providing specifics for following the external steps.

This tutorial uses Cloud Shell in the Google Cloud Platform Console. Though you can set up failover clustering by using the GCP Console user interface or the Cloud SDK, this tutorial mainly uses Cloud Shell so that you can complete the steps faster. When more appropriate, some steps use the GCP Console instead.


It's a good idea to take snapshots of your Compute Engine persistent disks along the way. If something goes wrong, you can use a snapshot to avoid starting over from the beginning. This tutorial suggests good times to take the snapshots.

If you find that things aren't working as you expect, there might be instructions in the section you're reading. Otherwise, refer to the Troubleshooting section.

Objectives

  • Create a network.
  • Install Windows Server 2016 on two Compute Engine VMs.
  • Install and configure Active Directory on a third instance of Windows Server.
  • Set up the failover cluster, including a file share witness for the quorum and a role for the workload.
  • Set up the internal load balancer.
  • Test the failover operation to verify that the cluster is working.

Costs

This tutorial uses Compute Engine images that include Windows Server licenses. This means the cost to run this tutorial can be significant if you leave VMs running. It's a good idea to stop the VMs when you're not using them.

See the Pricing Calculator for an estimate of the costs to complete this tutorial.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a GCP project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your project.

    Learn how to enable billing

  4. Enable the Compute Engine API.

    Enable the API

  5. Start an instance of Cloud Shell.
    OPEN CLOUD SHELL

Creating the network

Your cluster requires a custom network. Use VPC to create a custom network and one subnetwork by running gcloud commands in Cloud Shell.

  1. Create the network:

    gcloud compute networks create wsfcnet --subnet-mode custom
    

    The name of the network you created is wsfcnet.

  2. Create a subnetwork. Replace [YOUR_REGION] with a nearby GCP region:

    gcloud compute networks subnets create wsfcnetsub1 --network wsfcnet --region [YOUR_REGION] --range 10.0.0.0/16
    

    The name of the subnetwork you created is wsfcnetsub1.

Notice that the CIDR range for IP addresses in this subnetwork is 10.0.0.0/16. This is an example range used for this tutorial. In production systems, work with your network administrators to allocate appropriate ranges for IP addresses for your systems.

Create firewall rules

By default, your network is closed to external traffic. You must open ports in the firewall to enable remote connections to the servers. Use gcloud commands in Cloud Shell to create the rules.

  1. For this tutorial, open ports 22 and 3389 on the main network to enable SSH and RDP connections. In the following command, replace [YOUR_IPv4_ADDRESS] with the IP address of the computer you use to connect to your VM instances. In a production system, you can provide an IP address range or a series of addresses.

    gcloud compute firewall-rules create allow-ssh --network wsfcnet --allow tcp:22,tcp:3389 --source-ranges [YOUR_IPv4_ADDRESS]
    
  2. On the subnetwork, allow all protocols on all ports to enable the servers to communicate with each other. In production systems, consider opening only specific ports, as needed.

    gcloud compute firewall-rules create allow-all-subnet --network wsfcnet --allow all --source-ranges 10.0.0.0/16
    

    Notice that the source-ranges value matches the CIDR range you used to create the subnetwork.

  3. View your firewall rules:

    gcloud compute firewall-rules list
    

    You should see output similar to the following:

    NAME              NETWORK  DIRECTION  PRIORITY  ALLOW            DENY
    allow-all-subnet  wsfcnet  INGRESS    1000      all
    allow-ssh         wsfcnet  INGRESS    1000      tcp:22,tcp:3389

Enabling failover clustering in Compute Engine

Add the custom metadata to enable failover clustering in the Compute Engine agent. For simplicity, this tutorial uses project-wide metadata, which applies these attributes to all VMs in the project. Other options include adding individual metadata for each VM or creating a configuration file on each VM, as described in the Compute Engine documentation. This tutorial relies on the default behavior for wsfc-addrs and wsfc-agent-port. You don't need to set those values.

gcloud compute project-info add-metadata --metadata enable-wsfc=true
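
If you prefer to scope the setting to individual VMs instead of the whole project, a per-instance variant looks like the following sketch. The instance name refers to the wsfc-1 VM that you create in the next section; repeat the command for each cluster node.

gcloud compute instances add-metadata wsfc-1 --metadata enable-wsfc=true --zone [YOUR_ZONE]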

Creating the servers

Next, create the three servers by running gcloud commands in Cloud Shell.

Create the first cluster-node server

Create a new Compute Engine instance. Configure the instance as follows:

  • Name the instance wsfc-1.
  • Set the --zone flag to a zone near you. Replace [YOUR_ZONE] with a convenient zone, such as us-central1-a.
  • Set the --machine-type flag to n1-standard-2.
  • Set the --image-project flag to windows-cloud.
  • Set the --image-family flag to windows-2016.
  • Set the --scopes flag to https://www.googleapis.com/auth/compute.
  • Set the --can-ip-forward flag to enable IP forwarding.
  • Set the --private-network-ip flag to 10.0.0.4.
  • Set the network to wsfcnet and the subnetwork to wsfcnetsub1.

Run the following command, replacing [YOUR_ZONE] with the name of your zone:

gcloud compute instances create wsfc-1 --zone [YOUR_ZONE] --machine-type n1-standard-2 --image-project windows-cloud --image-family windows-2016 --scopes https://www.googleapis.com/auth/compute --can-ip-forward --private-network-ip 10.0.0.4 --network wsfcnet --subnet wsfcnetsub1

Create the second cluster-node server

For the second server, follow the same steps, except:

  • Set the instance name to: wsfc-2.
  • Set the --private-network-ip flag to 10.0.0.5.

Replace [YOUR_ZONE] with the name of your zone:

gcloud compute instances create wsfc-2 --zone [YOUR_ZONE] --machine-type n1-standard-2 --image-project windows-cloud --image-family windows-2016 --scopes https://www.googleapis.com/auth/compute --can-ip-forward --private-network-ip 10.0.0.5 --network wsfcnet --subnet wsfcnetsub1

Create the third server for Active Directory

For the domain controller, follow the same steps, except:

  • Set the instance name to: wsfc-dc.
  • Set the --private-network-ip flag to 10.0.0.6.

Replace [YOUR_ZONE] with the name of your zone:

gcloud compute instances create wsfc-dc --zone [YOUR_ZONE] --machine-type n1-standard-2 --image-project windows-cloud --image-family windows-2016 --scopes https://www.googleapis.com/auth/compute --can-ip-forward --private-network-ip 10.0.0.6 --network wsfcnet --subnet wsfcnetsub1

View your instances

You can see the details about the instances you created.

gcloud compute instances list

You will see output similar to the following:

NAME     ZONE           MACHINE_TYPE   PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP     STATUS
wsfc-1   us-central1-a  n1-standard-2               10.0.0.4     35.203.131.133  RUNNING
wsfc-2   us-central1-a  n1-standard-2               10.0.0.5     35.203.130.194  RUNNING
wsfc-dc  us-central1-a  n1-standard-2               10.0.0.6     35.197.27.2     RUNNING

Create the Compute Engine instance group

Creating an instance group to contain the cluster nodes enables you to create the required internal load balancer. You create the load balancer in an upcoming section. Don't add the domain controller wsfc-dc to the instance group.

Replace [YOUR_ZONE] with the name of your zone:

gcloud compute instance-groups unmanaged create wsfc-group --zone=[YOUR_ZONE]
gcloud compute instance-groups unmanaged add-instances wsfc-group --instances wsfc-1,wsfc-2 --zone [YOUR_ZONE]

Connecting through RDP

The Compute Engine documentation provides details about how to connect to your Windows VM instances by using RDP. You can either:

  • Use an existing client.
  • Add a Chrome RDP plugin to your browser and then connect through the GCP Console.

    Learn how to use RDP

Whenever this tutorial tells you to connect to a Windows instance, use your preferred RDP connection.

Configuring Windows networking

Get the IP address for the GCP gateway. In Cloud Shell, replace [YOUR_REGION] with the name of your region and run:

gcloud compute networks subnets describe wsfcnetsub1 --region [YOUR_REGION]

The output includes the IP address for the gateway, such as:

gatewayAddress: 10.0.0.1

Now, use RDP to connect to wsfc-1, wsfc-2, and wsfc-dc, and repeat the following steps for each instance:

  1. In Server Manager, in the left pane, select Local Server.
  2. In the Properties pane under Ethernet, click Assigned by DHCP.
  3. Right-click Ethernet and select Properties.
  4. Double-click Internet Protocol Version 4 (TCP/IPv4).
  5. Select Use the following IP address.
  6. Enter the IP address that you assigned to the VM when you created it.

    • For wsfc-1, enter "10.0.0.4".
    • For wsfc-2, enter "10.0.0.5".
    • For wsfc-dc, enter "10.0.0.6".
  7. For Subnet mask, enter "255.255.0.0".

  8. For Default gateway, enter the IP address of the gateway for wsfcnetsub1. You found this IP address at the start of this section.

  9. For wsfc-1 and wsfc-2, click Use the following DNS server addresses. Skip this step and the next one on wsfc-dc, because that VM is the domain controller and provides DNS itself.

  10. For Preferred DNS server, enter "10.0.0.6".

  11. Close all the dialog boxes.

    You lose RDP connectivity because these changes reset the virtual network adapter for the VM instance.

  12. Close the RDP session and then reconnect to the instance. If a dialog box from the previous step is still open, close it.

  13. In the properties section for the local server, verify that the Ethernet setting reflects the local server IP address (10.0.0.4, 10.0.0.5, or 10.0.0.6). If it doesn't, re-open the Internet Protocol Version 4 (TCP/IPv4) dialog box and update the setting.
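
If you prefer to configure the adapter from an elevated PowerShell prompt instead of the dialog boxes, the following sketch shows the equivalent settings for wsfc-1. The interface alias "Ethernet" and the gateway address 10.0.0.1 are assumptions; use the values from Get-NetAdapter and the gateway address you looked up earlier. As with the dialog boxes, applying these settings drops your RDP session.

# Stop using DHCP on the interface before assigning the static address.
Set-NetIPInterface -InterfaceAlias "Ethernet" -Dhcp Disabled

# Assign the static address (use 10.0.0.5 on wsfc-2 and 10.0.0.6 on wsfc-dc).
New-NetIPAddress -InterfaceAlias "Ethernet" -IPAddress 10.0.0.4 -PrefixLength 16 -DefaultGateway 10.0.0.1

# On wsfc-1 and wsfc-2 only, point DNS at the domain controller.
Set-DnsClientServerAddress -InterfaceAlias "Ethernet" -ServerAddresses 10.0.0.6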

This is a good time to take snapshots of wsfc-1 and wsfc-2.
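
For example, you can create the snapshots from Cloud Shell. This sketch assumes the default boot disk names, which match the instance names:

gcloud compute disks snapshot wsfc-1 wsfc-2 --zone [YOUR_ZONE]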

Setting up Active Directory

Now, set up the domain controller.

  1. Use RDP to connect to the server named wsfc-dc.
  2. Set a password for the local Administrator account.
  3. Enable the local Administrator account.
  4. Follow the steps in the Microsoft instructions below to set up the domain controller, with these additional notes. You can use default values for most settings.

    • Select the DNS Server role check box. This step is not specified in the instructions.
    • Select the Restart the destination server automatically if required check box.
    • Promote the server to a domain controller.
    • During the Add a new forest step, name your domain "WSFC.TEST".
    • Set the NetBIOS domain name to "WSFC" (the default).

    Microsoft Instructions
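
If you prefer PowerShell to the Server Manager wizard, the following sketch performs roughly the same promotion on wsfc-dc. It is a minimal outline of the wizard flow described above, not a replacement for the Microsoft instructions; Install-ADDSForest prompts for the safe-mode administrator password and restarts the server when it finishes.

# Install the AD DS role and management tools.
Install-WindowsFeature AD-Domain-Services -IncludeManagementTools

# Promote wsfc-dc to a domain controller for a new forest.
# -InstallDns adds the DNS Server role, matching the note above.
Install-ADDSForest -DomainName "WSFC.TEST" -DomainNetbiosName "WSFC" -InstallDns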

This is a good time to take a snapshot of wsfc-dc.

Create the domain user account

It can take some time for wsfc-dc to restart. Before joining servers to the domain, use RDP to sign in to wsfc-dc to validate that the domain controller is running.

You need a domain user that has administrator privileges for the cluster servers. Follow these steps:

  1. On the domain controller (wsfc-dc) click Start, and then type dsa to find and open the Active Directory Users and Computers app.
  2. Right-click WSFC.TEST, point to New, and then click User.
  3. For the Full name and the User logon name, enter "clusteruser".
  4. Click Next.
  5. Enter and confirm a password for the user. Select password options in the dialog box. For example, you can set the password to never expire.
  6. Confirm the settings and then click Finish.
  7. Make clusteruser an administrator on wsfc-dc:

    • On wsfc-dc, go to the Active Directory Users and Computers app.
    • Right-click clusteruser, click Add to a group, enter Administrators, and then click OK.

This tutorial uses the WSFC.TEST\clusteruser account as an administrator account wherever such an account is required. In a production system, follow your usual security practices for allocating accounts and permissions. For more information, see Overview of Active Directory accounts needed by a failover cluster.
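
If you prefer to create the account from PowerShell on wsfc-dc, here is a sketch of the same steps. The password prompt and the never-expires option mirror the dialog boxes above.

# Create the domain user; you are prompted for the password.
New-ADUser -Name "clusteruser" -SamAccountName "clusteruser" `
    -AccountPassword (Read-Host -Prompt "Password" -AsSecureString) `
    -Enabled $true -PasswordNeverExpires $true

# Make clusteruser an administrator on the domain controller.
Add-ADGroupMember -Identity "Administrators" -Members "clusteruser"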

Join the servers to the domain

Add the two cluster-node servers to the WSFC.TEST domain. Perform the following steps on each cluster-node server (wsfc-1 and wsfc-2):

  1. In Server Manager > Local Server, in the Properties pane, click WORKGROUP.
  2. Click Change.
  3. Select Domain and then enter "WSFC.TEST".
  4. Click OK.
  5. Provide the credentials for WSFC.TEST\clusteruser to join the domain.
  6. Click OK.
  7. Close the dialog boxes and follow the prompts to restart the server.
  8. Make clusteruser an administrator on wsfc-1 and wsfc-2.

    • Double-click Computer Management > Local Users and Groups > Groups > Administrators, and then click Add.
    • Enter "clusteruser" and the click Check names.
    • Click OK.
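
If you prefer PowerShell, the following sketch performs the same join and group change on each cluster node. Add-Computer prompts for the WSFC\clusteruser password and restarts the server; run the last command after you reconnect.

# Join the WSFC.TEST domain and restart (run on wsfc-1 and wsfc-2).
Add-Computer -DomainName "WSFC.TEST" -Credential "WSFC\clusteruser" -Restart

# After the restart, make clusteruser a local administrator.
Add-LocalGroupMember -Group "Administrators" -Member "WSFC\clusteruser"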

This is a good point to take snapshots of all three VMs.

Setting up failover clustering

To create and configure the failover cluster:

  1. Use RDP to connect to wsfc-1 and wsfc-2.
  2. Follow the steps in the Microsoft instructions below, with these additional notes:

    • Install the Failover Clustering feature on wsfc-1 and wsfc-2. Don't install the Failover Clustering feature on wsfc-dc.
    • Run the Failover Cluster Manager app as the domain user WSFC.TEST\clusteruser. Otherwise, you might encounter permissions issues. It's a good idea to always run Failover Cluster Manager this way or to connect to a server as clusteruser to ensure you have the required permissions.
    • Add wsfc-1 and wsfc-2 to the cluster as nodes.
    • When validating the configuration:

      • On the Testing Options page, select Run only tests I select, and then click Next.
      • On the Test Selection page, clear Storage, because the Storage option fails when running on Compute Engine (as it would for separate standalone physical servers).

        Common issues you might encounter during cluster validation include:

        • Only one network interface between replicas. You can ignore this one, because it doesn't apply in a cloud-based setup.
        • Windows Updates not the same on both replicas. If you configured your Windows instances to apply updates automatically, one of the nodes might have applied updates that the other hasn't downloaded yet. You should keep the servers in identical configurations.
        • Pending reboot. You've made changes to one of the servers, and it needs a reboot to apply. Don't ignore this one.
        • The servers do not all have the same domain role. You can ignore this one.
        • The servers are not all in the same Organizational Unit (OU). This tutorial doesn't use an OU at all, but in a production system consider putting your cluster in its own OU. The Microsoft instructions describe this best practice.
        • Unsigned drivers were found. You can ignore this one.
    • On the Summary page, you can select Create the cluster now using the validated nodes to continue on to create the cluster, rather than closing the wizard and reopening it.

    • In the Create Cluster Wizard, on the Access point page, name your cluster "testcluster".
    • In the Address field, enter "10.0.0.8".

    MICROSOFT INSTRUCTIONS
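
For reference, a PowerShell sketch of the main wizard steps looks like the following. It mirrors the notes above (validation without the storage tests, no storage, static address 10.0.0.8); the Microsoft instructions remain the authoritative walkthrough.

# Install the feature on wsfc-1 and wsfc-2 (not on wsfc-dc).
Install-WindowsFeature Failover-Clustering -IncludeManagementTools

# Run the remaining commands once, as WSFC.TEST\clusteruser, on one node.
# Skip the storage tests, which fail on Compute Engine as noted above.
Test-Cluster -Node wsfc-1,wsfc-2 -Ignore Storage

# Create the cluster with the access point name and address from the notes above.
New-Cluster -Name testcluster -Node wsfc-1,wsfc-2 -StaticAddress 10.0.0.8 -NoStorage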

Add the cluster administrator

Adding a domain account as an administrator for the cluster enables you to perform actions on the cluster from tools such as Windows PowerShell. Add the clusteruser domain account as a cluster admin.

  1. On the cluster node that hosts the cluster resources, in Failover Cluster Manager, select your cluster in the left pane and then click Properties in the right pane.
  2. Select the Cluster Permissions tab.
  3. Click Add and then add clusteruser.
  4. With clusteruser selected in the Group or user names list, select Full Control in the Permissions pane.
  5. Click Apply and OK.
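
A PowerShell sketch of the same permission change, run on the node that hosts the cluster resources:

Grant-ClusterAccess -User "WSFC\clusteruser" -Full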

This is a good point to take snapshots.

Creating the file share witness

You have a two-node failover cluster, but the cluster relies on a voting mechanism: a majority of votes is required for the cluster to keep running. With only two nodes, the cluster can't achieve a majority if one node becomes unavailable. To provide the third vote needed for a quorum, you add a file share witness.

This tutorial simply adds a shared folder to the domain controller server. If this server were to go offline at the same time one of the cluster nodes is restarting, the entire cluster could stop working because the remaining server can't vote by itself. For this tutorial, the assumption is that the GCP infrastructure features, such as Live Migration and automatic restart, provide enough reliability to keep the shared folder alive.

If you want to create a more-highly-available file share witness, you have these options:

  • Use a cluster of Windows Servers to provide the share by using Storage Spaces Direct. This Windows Server 2016 feature can provide a highly available share for the quorum witness. For example, you could create a cluster for your Active Directory domain controller to provide both highly available domain services and the file share witness.
  • Use a file server solution such as GlusterFS or Avere vFXT.

Follow these steps to create the file share for the witness:

  1. Connect to wsfc-dc. This server hosts the file share.
  2. In Explorer, browse to the C drive.
  3. In the title bar, click the New Folder button.
  4. Name the new folder "shares".
  5. Double-click the shares folder to open it.
  6. Add a new folder and name it "clusterwitness-testcluster".

Configure sharing for the file share witness

You must set permissions on the file share witness folder to enable the cluster to use it.

  1. From Explorer, right-click the clusterwitness-testcluster folder and select Properties.
  2. On the Sharing tab, click Advanced Sharing.
  3. Select Share this folder.
  4. Click Permissions and then click Add.
  5. Click Object Types, select Computers, and then click OK.
  6. Add the machine account testcluster$.
  7. Give Full Control permissions to testcluster$.
  8. Click Apply and then close all the dialog boxes.
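
If you prefer PowerShell on wsfc-dc, the following sketch creates the folder and the share and grants the same Full Control share permission to the cluster's computer account:

# Create the witness folder.
New-Item -Path "C:\shares\clusterwitness-testcluster" -ItemType Directory

# Share the folder and grant Full Control to the cluster computer account.
New-SmbShare -Name "clusterwitness-testcluster" -Path "C:\shares\clusterwitness-testcluster" -FullAccess 'WSFC\testcluster$'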

Add the file share witness to the failover cluster

Now, configure the failover cluster to use the file share witness as a quorum vote.

  1. On the computer that hosts the cluster resources (wsfc-1), open the Failover Cluster Manager.
  2. In the left pane, right-click the name of the cluster (testcluster.WSFC.TEST) then point to More Actions, and then click Configure Cluster Quorum Settings.
  3. Step through the wizard pages by using the Next button at each step.
  4. For the quorum configuration option, choose Select the quorum witness.
  5. Choose Configure a file share witness.
  6. For the File Share Path, enter the path to the shared folder, such as "\\10.0.0.6\clusterwitness-testcluster". In this example, 10.0.0.6 is the IP address for the wsfc-dc VM.
  7. Confirm the settings and then click Finish.
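
The PowerShell equivalent, run as WSFC.TEST\clusteruser on a cluster node, is a single command:

Set-ClusterQuorum -FileShareWitness "\\10.0.0.6\clusterwitness-testcluster"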

Testing the failover cluster

Your Windows Server failover cluster should now be working. You can test manually moving cluster resources between your instances. You're not done yet, but this is a good checkpoint to validate that everything you've done so far is working.

  1. On wsfc-1, note the name of the Current Host Server in Failover Cluster Manager.
  2. Run Windows PowerShell as clusteruser.
  3. In PowerShell, run the following command to change the current host server:

    Move-ClusterGroup -Name "Cluster Group"
    

You should see the name of the current host server change to the other VM.
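
You can also confirm the owner from PowerShell; the OwnerNode column shows which VM currently hosts the cluster resources:

Get-ClusterGroup -Name "Cluster Group"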

If this didn't work, review the previous steps and see if you missed anything. The most common issue is a missing firewall rule that is blocking access on the network. Refer to the Troubleshooting section for more issues to check.

Otherwise, you can now move on to setting up the internal load balancer, which is required in order to route network traffic to the current host server in the cluster.

This is a good time to take snapshots.

Adding a role

In Windows failover clustering, roles host clustered workloads. You use a role to register, in the cluster, the IP address that your application uses. For this tutorial, you add a role for the test workload, which is the Internet Information Services (IIS) web server. Follow these steps:

  1. In Failover Cluster Manager, in the Actions pane, select Configure Role.
  2. In the Select Role page, select Other Server.
  3. In the Client Access Point page, enter the name "IIS".
  4. Set the address to "10.0.0.9".
  5. Skip Select Storage and Select Resource Types.
  6. Confirm the settings and then click Finish.

Confirmation dialog shows settings for role.
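
If you prefer PowerShell, Add-ClusterServerRole creates the same kind of role ("Other Server") with a client access point. Run it as WSFC.TEST\clusteruser:

Add-ClusterServerRole -Name "IIS" -StaticAddress 10.0.0.9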

Creating the internal load balancer

Now, create and configure the internal load balancer, which is required in order to route network traffic to the active cluster host node. You will use the GCP Console, because the user interface gives you a good view into how internal load balancing is organized.

  1. In the GCP Console, go to the Load balancing page.

    OPEN LOAD BALANCING

  2. Click Create Load Balancer.

  3. On the TCP Load Balancing card, click Start configuration.
  4. Select Only between my VMs and then click Continue.
  5. For Name, enter "wsfc-lb".

Don't click Create yet.

Configure the backend

Recall that the GCP internal load balancer uses a periodic health check to determine the active node. The health check pings the Compute Engine cluster host agent that runs on each cluster node. The health check payload is the IP address of the application, which is represented by the clustered role. The agent responds with a value of 1 if the node is active or 0 if it is not.

  1. Click Backend configuration.
  2. Select your current region.
  3. Select wsfcnet for Network.
  4. Select wsfc-group for Instance group.
  5. Create a health check.

    • For Name, enter "wsfc-hc".
    • Accept the default Protocol setting of TCP and change the Port to "59998" for cluster host agent responses.
    • For Request, enter "10.0.0.9".
    • For Response, enter "1".
    • For Check interval, enter "2".
    • For Timeout enter "1".
    • Click Save and continue.
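
For reference, the following is a sketch of roughly equivalent gcloud commands for the health check and the backend service. The Console flow above is the path this tutorial uses; if you script the load balancer instead, skip the Console wizard so you don't create it twice. [REGION] and [ZONE] stand for your region and zone.

gcloud compute health-checks create tcp wsfc-hc --port 59998 --request "10.0.0.9" --response "1" --check-interval 2s --timeout 1s

gcloud compute backend-services create wsfc-lb --load-balancing-scheme internal --protocol TCP --region [REGION] --health-checks wsfc-hc

gcloud compute backend-services add-backend wsfc-lb --instance-group wsfc-group --instance-group-zone [ZONE] --region [REGION]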

Configure the frontend

The frontend configuration creates a forwarding rule that defines how the load balancer handles incoming requests. For this tutorial, to keep it simple, you will test the system by making requests between the VMs in the subnetwork.

In your production system, you probably want to open the system up to external traffic, such as Internet traffic. To do this, you can create a bastion host that accepts external traffic and forwards it to your internal network. Using a bastion host is not covered in this tutorial.

  1. In the center pane, click Frontend configuration.
  2. For Name, enter "wsfc-lb-fe".
  3. Select your subnetwork (wsfcnetsub1).
  4. For IP, select Ephemeral (Custom).
  5. Enter "10.0.0.9". This is the same IP address you set for the role.
  6. For Ports, enter "80".
  7. Click Done.
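
Expressed as a gcloud forwarding rule, the frontend would look like this sketch (again, only if you're scripting the whole load balancer instead of using the Console wizard):

gcloud compute forwarding-rules create wsfc-lb-fe --load-balancing-scheme internal --network wsfcnet --subnet wsfcnetsub1 --address 10.0.0.9 --ip-protocol TCP --ports 80 --backend-service wsfc-lb --region [REGION]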

Review and finalize

  1. To see a summary of the internal load balancer settings, in the center pane, click Review and finalize. The summary appears in the right pane.
  2. Click Create. It takes a moment to create the load balancer.

    GCP Console shows final settings for internal load balancing.

Create firewall rules for the health check

You might have noticed that the GCP Console notified you that the health-check system would require a firewall rule to enable the health checks to reach their targets. In this section, you set up the firewall rule.

  1. Return to the Cloud Shell in the GCP Console.

    OPEN CLOUD SHELL

  2. Run the following command to create the firewall rule:

    gcloud compute firewall-rules create allow-health-check --network wsfcnet --source-ranges 130.211.0.0/22,35.191.0.0/16 --allow tcp:59998
    

Open the Windows Firewall

Now, create a Windows Firewall rule on each cluster node (wsfc-1 and wsfc-2). At a minimum, allow all inbound TCP connections through port 59998 for IP addresses 130.211.0.0/22 and 35.191.0.0/16.
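
For example, you can run the following PowerShell command as an administrator on wsfc-1 and wsfc-2 to create the rule:

New-NetFirewallRule -DisplayName "Allow WSFC health check" -Direction Inbound -Action Allow -Protocol TCP -LocalPort 59998 -RemoteAddress 130.211.0.0/22,35.191.0.0/16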

Validating the load balancer

After your internal load balancer is running, you can inspect its status to validate that it can find a healthy instance, and then test failover again.

  1. Return to the Load balancing page in the GCP Console.

    OPEN LOAD BALANCING

  2. Click the name of the load balancer (wsfc-lb).

    In the Backend section of the summary, you should see the instance group listed.

    In the Healthy column, you should see: 1 / 2

    This is the expected result. Of your two cluster nodes, only one is active at any time in the failover cluster, so the load balancer health check only works for that node.

    Even if you don't see the proper result in the Healthy column, proceed to the next step. Sometimes, you need to do at least one failover action to get the load balancer to find the IP address.

  3. To fail over, right-click the IIS role in Failover Cluster Manager and then click Move > Best Possible Node. This action moves the role to the new node shown in the Owner Node field:

    Owner Node field shown in failover cluster manager.

  4. Wait until the Status shows Running.

  5. Return to the Load balancing page, click Refresh, and verify that the Healthy column still shows 1 / 2.

    Load balancer status shows 1 healthy instance out of 2.

Tip: You can use the gcloud tool to check which instance is healthy, where [REGION] is your region:

gcloud compute backend-services get-health wsfc-lb --region=[REGION]

The output looks like the following:

backend: https://www.googleapis.com/compute/v1/projects/[PROJECT_NAME]/zones/us-west1-a/instanceGroups/wsfc-group
status:
  healthStatus:
  - healthState: UNHEALTHY
    instance: https://www.googleapis.com/compute/v1/projects/[PROJECT_NAME]/zones/us-west1-a/instances/wsfc-1
    ipAddress: 10.0.0.4
    port: 80
  - healthState: HEALTHY
    instance: https://www.googleapis.com/compute/v1/projects/[PROJECT_NAME]/zones/us-west1-a/instances/wsfc-2
    ipAddress: 10.0.0.5
    port: 80
  kind: compute#backendServiceGroupHealth

Installing your application

Now that you have a cluster, you can set up your application on each node and configure it for running in a clustered environment.

For this tutorial, you need to set up something that can demonstrate that the cluster is really working with the internal load balancer. Set up IIS on each VM to serve a simple web page.

You're not setting up IIS for HA in the cluster. You are creating separate IIS instances that each serve a different web page. After a failover, the web server serves its own content, not shared content.

Setting up your application or IIS for HA is beyond the scope of this tutorial.

Set up IIS

  1. On each cluster node, install IIS (a PowerShell alternative appears after these steps).

    • Be sure that Default Document is selected under Common HTTP Features.
    • On the Confirmation page, select the checkbox that enables automatic restarting of the destination server.
  2. Validate that each web server is working.

    1. Use RDP to connect to the VM named wsfc-dc.
    2. In Server Manager, in the Properties section at the top, turn off IE Enhanced Security Configuration.
    3. Open Internet Explorer.
    4. Browse to the IP address of each server:

      http://10.0.0.4/
      http://10.0.0.5/
      

In each case, you see the Welcome page, which is the default IIS web page.
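
If you prefer to install IIS from PowerShell rather than the Add Roles and Features wizard in step 1, a minimal sketch that includes the Default Document feature:

Install-WindowsFeature Web-Server,Web-Default-Doc -IncludeManagementTools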

Edit the default web pages

Change each default web page so you can easily see which server is currently serving the page.

  1. Use RDP to connect to the VM named wsfc-1.
  2. Run Notepad as administrator.
  3. Open C:\inetpub\wwwroot\iistart.htm in Notepad. Remember to browse for All Files, not just text files.
  4. In the <title> element, change the text to the name of the current server. For example:

        <title>wsfc-1</title>
    
  5. Save the HTML file.

  6. Repeat these steps for wsfc-2, setting the <title> element to wsfc-2.

Now, when you view a web page served from one of these servers, the name of the server appears as the title in the Internet Explorer tab.
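
If you'd rather make the edit from an elevated PowerShell prompt, here is a one-line sketch that uses the computer name as the title. It assumes the <title> element sits on a single line in iistart.htm.

(Get-Content 'C:\inetpub\wwwroot\iistart.htm') -replace '<title>.*?</title>', "<title>$env:COMPUTERNAME</title>" | Set-Content 'C:\inetpub\wwwroot\iistart.htm'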

Test the failover

  1. Use RDP to connect to the VM named wsfc-dc.
  2. Open Internet Explorer.
  3. Browse to the IP address of the load balancer role:

    http://10.0.0.9/
    

    You see the Welcome page with the name of the current server displayed in the tab title.

  4. Stop the current server to simulate a failure. In Cloud Shell, run the following command, replacing [INSTANCE_NAME] with the name of the current server you saw in the previous step, such as wsfc-1:

    gcloud compute instances stop [INSTANCE_NAME]
    
  5. Switch to your RDP connection to wsfc-dc.

    It can take a few moments for the load balancer to detect the move and reroute the traffic.

  6. After 30 seconds or so, refresh the page in Internet Explorer.

You should now see the name of the new active node displayed in the tab title. For example, if you started with wsfc-1 active, you now see wsfc-2 in the title. If you don't see the change right away or see a page-not-found error, refresh the browser again.
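
When you're done testing, bring the stopped node back into the cluster by restarting it from Cloud Shell:

gcloud compute instances start [INSTANCE_NAME]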

Congratulations! You now have a working Windows Server 2016 failover cluster running on GCP.

Troubleshooting

Here are some common issues you can check if things aren't working.

GCP firewall rules block the health check

If the health check isn't working, double-check that you have a firewall rule that allows incoming traffic from the IP address ranges that the health check system uses: 130.211.0.0/22 and 35.191.0.0/16.

Windows Firewall blocks health check

Make sure port 59998 is open in Windows Firewall on each cluster node.

Cluster nodes using DHCP

It's important that each VM in the cluster has a static IP address. If a VM is configured to use DHCP in Windows, change the networking settings in Windows to make the IPv4 address match the IP address of the VM as shown in the GCP Console. Also set the gateway IP address to match the address of the subnetwork gateway in the GCP VPC.

GCP network tags in firewall rules

If you use network tags in your firewall rules, be sure the correct tags are set on every VM instance. This tutorial doesn't use tags, but if you've set them for some other reason, they must be used consistently.

Cleaning up

After you've finished the failover clustering tutorial, clean up the resources you created on Google Cloud Platform so that you won't be billed for them in the future. The following sections describe how to delete or turn off these resources.

Deleting the project

The easiest way to eliminate billing is to delete the project you created for the tutorial.

To delete the project:

  1. In the GCP Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the checkbox next to the project you want to delete, and then click Delete project.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Deleting instances

To delete a Compute Engine instance:

  1. In the GCP Console, go to the VM Instances page.

    Go to the VM Instances page

  2. Click the checkbox next to the instance you want to delete.
  3. Click the Delete button at the top of the page to delete the instance.

Deleting persistent disks

To delete a persistent disk:

  1. In the GCP Console, go to the Disks page.

    Go to the Disks page

  2. Select the checkbox next to the name of the disk you want to delete.

  3. Click the Delete button at the top of the page.

What's next
