HA cluster configuration guide for SAP NetWeaver on SLES

This guide shows you how to deploy and configure a performance-optimized SUSE Linux Enterprise Server (SLES) high-availability (HA) cluster for an SAP NetWeaver system.

This guide includes the steps for:

  • Configuring Internal TCP/UDP Load Balancing to reroute traffic in the event of a failure
  • Configuring a Pacemaker cluster on SLES to manage the SAP systems and other resources during a failover

This guide also includes steps for configuring the SAP NetWeaver system for HA, but refer to the SAP documentation for the definitive instructions.

For information about deploying Compute Engine VMs for SAP NetWeaver that is not specific to high availability, see the SAP NetWeaver deployment guide that is specific to your operating system.

This guide is intended for advanced SAP NetWeaver users who are familiar with Linux high-availability configurations for SAP NetWeaver.

The system that this guide deploys

Following this guide, you will deploy two SAP NetWeaver instances and set up an HA cluster on SLES. You deploy each SAP NetWeaver instance on a Compute Engine VM in a different zone within the same region. A high-availability installation of the underlying database is not covered in this guide.

Overview of a high-availability Linux cluster for a single-node SAP NetWeaver system

The deployed cluster includes the following functions and features:

  • Two host VMs, one with an active SAP Central Services (SCS) instance and one with an active Enqueue Replication Server (ERS) instance
  • The Pacemaker high-availability cluster resource manager.
  • A STONITH fencing mechanism.
  • Automatic restart of the failed instance as the new secondary instance.

This guide has you use the Cloud Deployment Manager templates that are provided by Google Cloud to deploy the Compute Engine virtual machines (VMs), which ensures that the VMs meet SAP supportability requirements and conform to current best practices.

Prerequisites

Before you create the SAP NetWeaver high availability cluster, make sure that the following prerequisites are met:

Except where required for the Google Cloud environment, the information in this guide is consistent with the following related guides from SUSE:

Creating a network

For security purposes, create a new network. You can control who has access by adding firewall rules or by using another access control method.

If your project has a default VPC network, don't use it. Instead, create your own VPC network so that the only firewall rules in effect are those that you create explicitly.

During deployment, VM instances typically require access to the internet to download Google's monitoring agent. If you are using one of the SAP-certified Linux images that are available from Google Cloud, the VM instance also requires access to the internet in order to register the license and to access OS vendor repositories. A configuration with a NAT gateway and with VM network tags supports this access, even if the target VMs do not have external IPs.

To set up networking:

  1. Go to Cloud Shell.

    Go to Cloud Shell

  2. To create a new network in the custom subnetworks mode, run:

    gcloud compute networks create NETWORK_NAME --subnet-mode custom

    Replace NETWORK_NAME with the name of the new network. The network name can contain only lowercase characters, digits, and the dash character (-).

    Specify --subnet-mode custom to avoid using the default auto mode, which automatically creates a subnet in each Compute Engine region. For more information, see Subnet creation mode.

  3. Create a subnetwork, and specify the region and IP range:

    gcloud compute networks subnets create SUBNETWORK_NAME \
            --network NETWORK_NAME --region REGION --range RANGE

    Replace the following:

    • SUBNETWORK_NAME: the name of the new subnetwork.
    • NETWORK_NAME: the name of the network you created in the previous step.
    • REGION: the region where you want the subnetwork.
    • RANGE: the IP address range, specified in CIDR format, such as 10.1.0.0/24. If you plan to add more than one subnetwork, assign non-overlapping CIDR IP ranges for each subnetwork in the network. Note that each subnetwork and its internal IP ranges are mapped to a single region.
  4. Optionally, repeat the previous step and add additional subnetworks.

Setting up a NAT gateway

If you need to create one or more VMs without public IP addresses, you need to use network address translation (NAT) to enable the VMs to access the internet. Use Cloud NAT, a Google Cloud distributed, software-defined managed service that lets VMs send outbound packets to the internet and receive any corresponding established inbound response packets. Alternatively, you can set up a separate VM as a NAT gateway.

To create a Cloud NAT instance for your project, see Using Cloud NAT.

After you configure Cloud NAT for your project, your VM instances can securely access the internet without a public IP address.
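
For example, a minimal Cloud NAT setup for the network that you created earlier might look like the following sketch. The router and NAT configuration names are placeholders that you choose; replace NETWORK_NAME and REGION with your own values:

gcloud compute routers create nat-router-name \
    --network NETWORK_NAME --region REGION

gcloud compute routers nats create nat-config-name \
    --router nat-router-name --region REGION \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges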

Adding firewall rules

By default, incoming connections from outside your Google Cloud network are blocked. To allow incoming connections, set up a firewall rule for your VM. Firewall rules regulate only new incoming connections to a VM. After a connection is established with a VM, traffic is permitted in both directions over that connection.

You can create a firewall rule to allow access to specified ports, or to allow access between VMs on the same subnetwork.

Create firewall rules to allow access for such things as:

  • The default ports used by SAP NetWeaver, as documented in TCP/IP Ports of All SAP Products.
  • Connections from your computer or your corporate network environment to your Compute Engine VM instance. If you are unsure of what IP address to use, talk to your company's network admin.
  • Communication between VMs in a 3-tier, scaleout, or high-availability configuration. For example, if you are deploying a 3-tier system, you will have at least 2 VMs in your subnetwork: the VM for SAP NetWeaver, and another VM for the database server. To enable communication between the two VMs, you must create a firewall rule to allow traffic that originates from the subnetwork.
  • Cloud Load Balancing health checks. For more information, see Create a firewall rule for the health checks.

To create a firewall rule:

  1. In the Cloud Console, go to the Firewall Rules page.

    Open Firewall Rules page

  2. At the top of the page, click Create firewall rule.

    • In the Network field, select the network where your VM is located.
    • In the Targets field, select All instances in the network.
    • In the Source filter field, select one of the following:
      • IP ranges to allow incoming traffic from specific IP addresses. Specify the range of IP addresses in the Source IP ranges field.
      • Subnets to allow incoming traffic from a particular subnetwork. Specify the subnetwork name in the Subnets field. You can use this option to allow access between the VMs in a 3-tier or scaleout configuration.
    • In the Protocols and ports section, select Specified protocols and ports and specify tcp:[PORT_NUMBER].
  3. Click Create to create your firewall rule.
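
If you prefer the gcloud CLI, you can create an equivalent rule with a single command. The following sketch uses placeholder values for the rule name, network, source range, and port:

gcloud compute firewall-rules create RULE_NAME \
    --network=NETWORK_NAME \
    --direction=INGRESS \
    --action=ALLOW \
    --source-ranges=SOURCE_IP_RANGE \
    --rules=tcp:PORT_NUMBER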

Deploying the VMs for SAP NetWeaver

Before you begin configuring the HA cluster, you define and deploy the VM instances that will serve as the primary and secondary nodes in your HA cluster.

To define and deploy the VMs, you use the same Cloud Deployment Manager template that you use to deploy a VM for an SAP NetWeaver system in the Automated VM deployment for SAP NetWeaver on Linux.

However, to deploy two VMs instead of one, you need to add the definition for the second VM to the configuration file by copying and pasting the definition of the first VM. After you create the second definition, you need to change the resource and instance names in the second definition. To protect against a zonal failure, specify a different zone in the same region. All other property values in the two definitions stay the same.

After the VMs have deployed successfully, you install SAP NetWeaver and define and configure the HA cluster.

The following instructions use Cloud Shell, but they also apply if you use the Cloud SDK on your local workstation.

  1. Open Cloud Shell.

    Go to Cloud Shell

  2. Download the YAML configuration file template, template.yaml, to your working directory:

    wget https://storage.googleapis.com/cloudsapdeploy/deploymentmanager/latest/dm-templates/sap_nw/template.yaml

  3. Optionally, rename the template.yaml file to identify the configuration it defines. For example, nw-ha-sles15sp1.yaml.

  4. Open the YAML configuration file in the Cloud Shell code editor by clicking the pencil icon in the upper-right corner of the Cloud Shell terminal window.

  5. In the YAML configuration file template, define the first VM instance. You define the second VM instance in the next step after the following table.

    Specify the property values by replacing the brackets and their contents with the values for your installation. The properties are described in the following table. For an example of a completed configuration file, see Example of a complete YAML configuration file.

    Property Data type Description
    name String An arbitrary name that identifies the deployment resource that the following set of properties define.
    type String

    Specifies the location, type, and version of the Deployment Manager template to use during deployment.

    The YAML file includes two type specifications, one of which is commented out. The type specification that is active by default specifies the template version as latest. The type specification that is commented out specifies a specific template version with a timestamp.

    If you need all of your deployments to use the same template version, use the type specification that includes the timestamp.

    instanceName String The name for the VM instance that you are defining. Specify different names in the primary and secondary VM definitions. Consider using names that identify the instances as belonging to the same high-availability cluster.

    Instance names must be 13 characters or less and be specified in lowercase letters, numbers, or hyphens. Use a name that is unique within your project.

    instanceType String The type of Compute Engine VMs that you need. Specify the same instance type for the primary and secondary VMs.

    If you need a custom VM type, specify a small predefined VM type and, after deployment is complete, customize the VM as needed.

    zone String The Google Cloud zone in which to deploy the VM instance that you are defining. Specify different zones in the same region for the primary and secondary VM definitions. The zones must be in the same region that you selected for your subnet.
    subnetwork String The name of the subnetwork that you created in a previous step. If you are deploying to a shared VPC, specify this value as [SHAREDVPC_PROJECT]/[SUBNETWORK]. For example, myproject/network1.
    linuxImage String The name of the Linux operating-system image or image family that you are using with SAP NetWeaver. To specify an image family, add the prefix family/ to the family name. For example, family/sles-15-sp1-sap. For the list of available image families, see the Images page in the Cloud Console.
    linuxImageProject String The Google Cloud project that contains the image you are going to use. This project might be your own project or the Google Cloud image project suse-sap-cloud. For a list of Google Cloud image projects, see the Images page in the Compute Engine documentation.
    usrsapSize Integer The size of the `/usr/sap` disk. The minimum size is 8 GB.
    sapmntSize Integer The size of the `/sapmnt` disk. The minimum size is 8 GB.
    swapSize Integer The size of the swap volume. The minimum size is 1 GB.
    networkTag String

    Optional. One or more comma-separated network tags that represent your VM instance for firewall or routing purposes.

    For high-availability configurations, specify a network tag to use for a firewall rule that allows communication between the cluster nodes and a network tag to use in a firewall rule that allows the Cloud Load Balancing health checks to access the cluster nodes.

    If you specify `publicIP: No` and do not specify a network tag, be sure to provide another means of access to the internet.

    serviceAccount String

    Optional. Specifies a custom service account to use for the deployed VM. The service account must include the permissions that are required during deployment to configure the VM for SAP.

    If serviceAccount is not specified, the default Compute Engine service account is used.

    Specify the full service account address. For example, sap-ha-example@example-project-123456.iam.gserviceaccount.com

    publicIP Boolean Optional. Determines whether a public IP address is added to your VM instance. The default is Yes.
    sap_deployment_debug Boolean Optional. If this value is set to Yes, the deployment generates verbose deployment logs. Do not turn this setting on unless a Google support engineer asks you to enable debugging.
  6. In the YAML configuration file, create the definition of the second VM by copying the definition of the first VM and pasting the copy after the first definition. For an example, see Example of a complete YAML configuration file.

  7. In the definition of the second VM, specify different values for the following properties than you specified in the first definition:

    • name
    • instanceName
    • zone
  8. Create the VM instances:

    gcloud deployment-manager deployments create [DEPLOYMENT_NAME] --config [TEMPLATE_NAME].yaml

    where:

    • [DEPLOYMENT_NAME] represents the name of your deployment.
    • [TEMPLATE_NAME] represents the name of your YAML configuration file.

    The preceding command invokes the Deployment Manager, which deploys the VMs according to the specifications in the YAML configuration file.

    Deployment processing consists of two stages. In the first stage, Deployment Manager writes its status to the console. In the second stage, the deployment scripts write their status to Cloud Logging.

Example of a complete YAML configuration file

The following example shows a completed YAML configuration file that deploys two VM instances for an HA configuration for SAP NetWeaver by using the latest version of the Deployment Manager templates. The example omits the comments that the template contains when you first download it.

The file contains the definitions of two resources to deploy: sap_nw_node_1 and sap_nw_node_2. Each resource definition contains the definitions for a VM.

The sap_nw_node_2 resource definition was created by copying and pasting the first definition, and then modifying the values of name, instanceName, and zone properties. All other property values in the two resource definitions are the same.

The properties networkTag and serviceAccount are from the Advanced Options section of the configuration file template.

resources:
- name: sap_nw_node_1
  type: https://storage.googleapis.com/cloudsapdeploy/deploymentmanager/latest/dm-templates/sap_nw/sap_nw.py
  properties:
    instanceName: nw-ha-vm-1
    instanceType: n2-standard-4
    zone: us-central1-b
    subnetwork: example-sub-network-sap
    linuxImage: family/sles-15-sp2-sap
    linuxImageProject: suse-sap-cloud
    usrsapSize: 15
    sapmntSize: 15
    swapSize: 24
    networkTag: cluster-ntwk-tag,allow-health-check
    serviceAccount: limited-roles@example-project-123456.iam.gserviceaccount.com
- name: sap_nw_node_2
  type: https://storage.googleapis.com/cloudsapdeploy/deploymentmanager/latest/dm-templates/sap_nw/sap_nw.py
  properties:
    instanceName: nw-ha-vm-2
    instanceType: n2-standard-4
    zone: us-central1-c
    subnetwork: example-sub-network-sap
    linuxImage: family/sles-15-sp2-sap
    linuxImageProject: suse-sap-cloud
    usrsapSize: 15
    sapmntSize: 15
    swapSize: 24
    networkTag: cluster-ntwk-tag,allow-health-check
    serviceAccount: limited-roles@example-project-123456.iam.gserviceaccount.com

Create firewall rules that allow access to the host VMs

If you haven't done so already, create firewall rules that allow access to each host VM from the following sources:

  • For configuration purposes, your local workstation, a bastion host, or a jump server
  • For access between the cluster nodes, the other host VMs in the HA cluster
  • The health checks that are used by Cloud Load Balancing, as described in the later step Create a firewall rule for the health checks.

When you create VPC firewall rules, you specify the network tags that you defined in the template.yaml configuration file to designate your host VMs as the target for the rule.

To verify deployment, define a rule to allow SSH connections on port 22 from a bastion host or your local workstation.

For access between the cluster nodes, add a firewall rule that allows all connection types on any port from other VMs in the same subnetwork.

Make sure that the firewall rules for verifying deployment and for intra-cluster communication are created before proceeding to the next section. For instructions, see Adding firewall rules.
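
As an illustration, the following commands create both rules by using the cluster-ntwk-tag network tag from the example configuration file. The rule names, the bastion IP range, and the subnetwork range 10.1.0.0/24 are assumptions; adapt them to your environment:

gcloud compute firewall-rules create allow-ssh-to-cluster \
    --network=example-network-sap --direction=INGRESS --action=ALLOW \
    --source-ranges=BASTION_IP_RANGE \
    --target-tags=cluster-ntwk-tag \
    --rules=tcp:22

gcloud compute firewall-rules create allow-intra-cluster \
    --network=example-network-sap --direction=INGRESS --action=ALLOW \
    --source-ranges=10.1.0.0/24 \
    --target-tags=cluster-ntwk-tag \
    --rules=tcp,udp,icmp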

Verifying the deployment of the VMs

Before you install SAP NetWeaver or begin configuring the HA cluster, verify that the VMs were deployed correctly by checking the logs and the OS storage mapping.

Checking the logs

The following steps use Cloud Logging, which might incur charges. For more information, see Cloud Logging pricing.

  1. Open Cloud Logging to check for errors and monitor the progress of the installation.

    Go to Logging

  2. On the Resources tab, select Global as your logging resource. If INSTANCE DEPLOYMENT COMPLETE is displayed for a VM, Deployment Manager processing is complete for the VM.

    Logging display
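
Alternatively, you can query the deployment logs from the command line. The following sketch assumes that the deployment scripts write the completion message shown above to the Global logging resource:

gcloud logging read 'resource.type="global" AND textPayload:"INSTANCE DEPLOYMENT COMPLETE"' \
    --limit=10 --format="value(timestamp,textPayload)"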

Checking the configuration of the VMs

  1. After the VM instances deploy, connect to the VMs by using ssh.

    1. If you haven't already done so, create a firewall rule to allow an SSH connection on port 22.
    2. Go to the VM Instances page.

      Go to the VM Instances page

    3. Connect to each VM instance by clicking the SSH button on the entry for each VM instance, or you can use your preferred SSH method.

      SSH button on Compute Engine VM instances page.

  2. Display the file system:

    ~> df -h

    Ensure that you see output similar to the following:

    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                    32G  8.0K   32G   1% /dev
    tmpfs                       48G     0   48G   0% /dev/shm
    tmpfs                       32G  402M   32G   2% /run
    tmpfs                       32G     0   32G   0% /sys/fs/cgroup
    /dev/sda3                   30G  3.4G   27G  12% /
    /dev/sda2                   20M  3.7M   17M  19% /boot/efi
    /dev/mapper/vg_usrsap-vol   15G   48M   15G   1% /usr/sap
    /dev/mapper/vg_sapmnt-vol   15G   48M   15G   1% /sapmnt
    tmpfs                      6.3G     0  6.3G   0% /run/user/1002
    tmpfs                      6.3G     0  6.3G   0% /run/user/0
  3. Confirm that the swap space was created:

    ~> cat /proc/meminfo | grep Swap

    You see results similar to the following example:

    SwapCached:            0 kB
    SwapTotal:      25161724 kB
    SwapFree:       25161724 kB

If any of the validation steps show that the installation failed:

  1. Correct the error.
  2. On the Deployments page, delete the deployment to clean up the VMs and persistent disks from the failed installation.
  3. Rerun your deployment.

Update the Cloud SDK

The Deployment Manager template installed the Cloud SDK on the VMs during deployment. Update the Cloud SDK to ensure that it includes all of the latest updates.

  1. SSH into the primary VM.

  2. Update the Cloud SDK:

    ~>  sudo gcloud components update
  3. Follow the prompts.

  4. Repeat these steps on the secondary VM.

Enable load balancer back-end communication between the VMs

After you have confirmed that the VMs deployed successfully, enable backend communication between the VMs that will serve as the nodes in your HA cluster.

  1. On each VM in the HA cluster, enable local routing.

    1. SSH into each VM in your planned cluster.
    2. Switch to the root user.
    3. On each machine, enable local routing on the primary interface by issuing the following command. If you are using a different interface than eth0, specify that interface in the command instead of eth0.

      echo net.ipv4.conf.eth0.accept_local=1 >> /etc/sysctl.conf
      sysctl -p

      The echo command appends the setting to /etc/sysctl.conf, and the sysctl -p command applies it immediately.

  2. On each VM, create a startup script to enable backend-to-backend communication:

    Cloud Console

    1. Go to the VM instances page in the Cloud Console

      Go to the VM Instances page

    2. Click on the name of the primary VM.

    3. On the VM instance details page, click the EDIT button.

    4. In the Custom metadata section, click Add item.

    5. In the Key field, specify startup-script.

    6. In the Value field, paste the following bash script:

      #! /bin/bash
      # VM startup script
      
      nic0_mac="$(curl -H "Metadata-Flavor:Google" \
      --connect-timeout 5 --retry 5 --retry-max-time 60 \
      http://169.254.169.254/computeMetadata/v1/instance/network-interfaces/0/mac)"
      
      nic0_ip="$(curl -H "Metadata-Flavor:Google" \
      --connect-timeout 5 --retry 5 --retry-max-time 60 \
      http://169.254.169.254/computeMetadata/v1/instance/network-interfaces/0/ip)"
      
      for nic in $(ls /sys/class/net); do
      nic_addr=$(cat /sys/class/net/"${nic}"/address)
      if [ "$nic_addr" == "$nic0_mac" ]; then
        nic0_name="$nic"
        break
      fi
      done
      
      [[ -n $nic0_name ]] && [[ -n $nic0_ip ]] \
      && logger -i "gce-startup-script: INFO adding IP configuration for ILB client" \
      || logger -i "gce-startup-script: ERROR could not determine IP or interface name"
      
      if [ -n "$nic0_name" ]; then
      ip rule add pref 0 from all iif "${nic0_name}" lookup local
      ip rule del from all lookup local
      ip route add local "${nic0_ip}" dev "${nic0_name}" proto kernel \
        scope host src "${nic0_ip}" table main
      ip route add local 127.0.0.0/8 dev lo proto kernel \
        scope host src 127.0.0.1 table main
      ip route add local 127.0.0.1 dev lo proto kernel \
        scope host src 127.0.0.1 table main
      ip route add broadcast 127.0.0.0 dev lo proto kernel \
        scope link src 127.0.0.1 table main
      ip route add broadcast 127.255.255.255 dev lo proto kernel \
        scope link src 127.0.0.1 table main
      fi
    7. At the bottom of the page, click Save.

    8. Reboot the server for the startup script to take effect.

      When you are done, your Custom metadata should look similar to the following example:

      Screen capture shows "startup-script" with other entries in the Custom metadata section on the VM details page in the Cloud Console.

    9. Repeat the preceding steps for the secondary server.

    gcloud

    1. In the Cloud Shell or wherever you have the Cloud SDK installed, use the following gcloud command with the included startup script to add a startup script to the instance metadata for each VM. Replace the variables with the name and zone of the VM before entering the command.

      gcloud compute instances add-metadata primary-vm-name \
      --zone=primary-vm-zone --metadata=startup-script='#! /bin/bash
      # VM startup script
      
      nic0_mac="$(curl -H "Metadata-Flavor:Google" \
      --connect-timeout 5 --retry 5 --retry-max-time 60 \
      http://169.254.169.254/computeMetadata/v1/instance/network-interfaces/0/mac)"
      
      nic0_ip="$(curl -H "Metadata-Flavor:Google" \
      --connect-timeout 5 --retry 5 --retry-max-time 60 \
      http://169.254.169.254/computeMetadata/v1/instance/network-interfaces/0/ip)"
      
      for nic in $(ls /sys/class/net); do
      nic_addr=$(cat /sys/class/net/"${nic}"/address)
      if [ "$nic_addr" == "$nic0_mac" ]; then
        nic0_name="$nic"
        break
      fi
      done
      
      [[ -n $nic0_name ]] && [[ -n $nic0_ip ]] \
      && logger -i "gce-startup-script: INFO adding IP configuration for ILB client" \
      || logger -i "gce-startup-script: ERROR could not determine IP or interface name"
      
      if [ -n "$nic0_name" ]; then
      ip rule add pref 0 from all iif "${nic0_name}" lookup local
      ip rule del from all lookup local
      ip route add local "${nic0_ip}" dev "${nic0_name}" proto kernel \
        scope host src "${nic0_ip}" table main
      ip route add local 127.0.0.0/8 dev lo proto kernel \
        scope host src 127.0.0.1 table main
      ip route add local 127.0.0.1 dev lo proto kernel \
        scope host src 127.0.0.1 table main
      ip route add broadcast 127.0.0.0 dev lo proto kernel \
        scope link src 127.0.0.1 table main
      ip route add broadcast 127.255.255.255 dev lo proto kernel \
        scope link src 127.0.0.1 table main
      fi'
    2. Reboot the server for the startup script to take effect.

    3. To confirm that the startup script is stored in the instance metadata, issue the following command:

      gcloud compute instances describe primary-vm-name \
      --zone=primary-vm-zone

      The startup script appears in the output under metadata, as shown in the following truncated example:

      metadata:
      fingerprint: Tbuij9k-knk=
      items:
      - key: post_deployment_script
      value: ''
      - key: sap_deployment_debug
      value: 'False'
      - key: status
      value: completed
      - key: startup-script
      value: |-
        #! /bin/bash
        # VM startup script
        ...
        [example truncated]
    4. For the secondary VM, repeat the preceding steps after replacing the variable values with the values for the secondary VM instance.

For more information about creating startup scripts for Compute Engine VMs, see Running startup scripts.
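
After the reboot, you can optionally confirm as root that the startup script applied the routing changes. The following checks assume that eth0 is the primary interface:

# sysctl net.ipv4.conf.eth0.accept_local
# ip rule show
# ip route show table main | grep local

The sysctl value should be 1, and the rule list should show a priority 0 rule that is restricted to the primary interface.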

Configure SSH keys on the primary and secondary VMs

To allow files to be copied between the hosts in the HA cluster, the steps in this section create root SSH connections between the two hosts.

The Deployment Manager templates that Google Cloud provides generate a key for you, but you can replace it with a key you generate if needed.

Your organization is likely to have guidelines that govern internal network communications. If necessary, after deployment is complete you can remove the metadata from the VMs and the keys from the authorized_keys file.

If setting up direct SSH connections does not comply with your organization's guidelines, you can transfer files by using other methods, such as:

  • Transfer smaller files through your local workstation by using the Cloud Shell Upload file and Download file menu options. See Managing files with Cloud Shell.
  • Exchange files using a Cloud Storage bucket, as shown in the example after this list. See Working with objects in the Cloud Storage documentation.
  • Use a file storage solution like Filestore or NetApp Cloud Volumes Service to create a shared folder. See File sharing solutions.
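
For the Cloud Storage option, a minimal gsutil workflow looks like the following. The bucket name and file paths are placeholders; create the bucket once, upload the file from one host, and download it on the other:

gsutil mb -l REGION gs://example-transfer-bucket
gsutil cp /path/to/file gs://example-transfer-bucket/
gsutil cp gs://example-transfer-bucket/file /destination/path/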

To enable SSH connections between the primary and secondary instances, follow these steps. The steps assume that you are using the ssh key that is generated by the Deployment Manager templates for SAP.

  1. On the primary host VM:

    1. Connect to the VM via SSH.

    2. Switch to root:

      $ sudo su -
    3. Confirm that the ssh key exists:

      # ls -l /root/.ssh/

      You should see the id_rsa key files as in the following example:

      -rw-r--r-- 1 root root  569 May  4 23:07 authorized_keys
      -rw------- 1 root root 2459 May  4 23:07 id_rsa
      -rw-r--r-- 1 root root  569 May  4 23:07 id_rsa.pub
    4. Update the secondary VM's metadata with information about the SSH key for the primary VM.

      # gcloud compute instances add-metadata secondary-vm-name \
      --metadata "ssh-keys=$(whoami):$(cat ~/.ssh/id_rsa.pub)" --zone secondary-vm-zone
    5. Confirm that the SSH keys are set up properly by opening an SSH connection from the primary system to the secondary system.

      # ssh secondary-vm-name
  2. On the secondary host VM:

    1. SSH into the VM.

    2. Switch to root:

      $ sudo su -
    3. Confirm that the ssh key exists:

      # ls -l /root/.ssh/

      You should see the id_rsa key files as in the following example:

      -rw-r--r-- 1 root root  569 May  4 23:07 authorized_keys
      -rw------- 1 root root 2459 May  4 23:07 id_rsa
      -rw-r--r-- 1 root root  569 May  4 23:07 id_rsa.pub
    4. Update the primary VM's metadata with information about the SSH key for the secondary VM.

      # gcloud compute instances add-metadata primary-vm-name \
      --metadata "ssh-keys=$(whoami):$(cat ~/.ssh/id_rsa.pub)" --zone primary-zone
      # cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    5. Confirm that the SSH keys are set up properly by opening an SSH connection from the secondary system to the primary system.

      # ssh primary-vm-name

Set up shared file storage and configure the shared directories

You need to set up an NFS file sharing solution that provides highly available shared file storage that both nodes of your HA cluster can access. You then create directories on both nodes that map to the shared file storage. The cluster software ensures that the appropriate directories are mounted only on the correct instances.

Setting up a file sharing solution is not covered in this guide. For instructions on setting up the file sharing system, see the instructions provided by the vendor of the solution you select.

For information about file sharing solutions that are available on Google Cloud, see Shared storage options for HA SAP systems on Google Cloud.
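
If you only want to try out the procedure in a test environment, a zonal Filestore Basic instance is a quick way to provide an NFS share, but it is not highly available and is therefore not suitable for a production HA cluster. The following is a sketch with placeholder names and the example network from this guide; check the Filestore documentation for the current tiers and capacity limits:

gcloud filestore instances create nw-ha-test-nfs \
    --zone=us-central1-b \
    --tier=BASIC_HDD \
    --file-share=name=nfs_share_nw_ha,capacity=1TB \
    --network=name=example-network-sap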

To configure the shared directories:

  1. If you did not already set up a highly available NFS shared file storage solution, do so now.

  2. Mount the NFS shared storage on both servers for initial configuration.

    ~> sudo mkdir /mnt/nfs
    ~> sudo mount -t nfs nfs-path /mnt/nfs
  3. From either server, create directories for sapmnt, the central transport directory, the system directory, and the instance-specific directory. If you are using a Java stack, replace "ASCS" with "SCS" before you use the following and any other example commands:

    ~> sudo mkdir /mnt/nfs/sapmntSID
    ~> sudo mkdir /mnt/nfs/usrsap{trans,SIDSYS,SIDASCSscs-instance-number,SIDERSers-instance-number}
  4. On both servers, create the necessary mount points:

    ~> sudo mkdir -p /sapmnt/SID
    ~> sudo mkdir -p /usr/sap/trans
    ~> sudo mkdir -p /usr/sap/SID/SYS
    ~> sudo mkdir -p /usr/sap/SID/ASCSscs-instance-number
    ~> sudo mkdir -p /usr/sap/SID/ERSers-instance-number
  5. Configure autofs to mount the common shared file directories when the file directories are first accessed. The mounting of the ASCSscs-instance-number and ERSers-instance-number directories is managed by the cluster software, which you configure in a later step.

    Adjust the NFS options in the commands as needed for your file-sharing solution.

    On both servers, configure autofs:

    ~> echo "/- /etc/auto.sap" | sudo tee -a /etc/auto.master
    ~> NFS_OPTS="-rw,relatime,vers=3,hard,proto=tcp,timeo=600,retrans=2,mountvers=3,mountport=2050,mountproto=tcp"
    ~> echo "/sapmnt/SID ${NFS_OPTS} nfs-path/sapmntSID" | sudo tee -a /etc/auto.sap
    ~> echo "/usr/sap/trans ${NFS_OPTS} nfs-path/usrsaptrans" | sudo tee -a /etc/auto.sap
    ~> echo "/usr/sap/SID/SYS ${NFS_OPTS} nfs-path/usrsapSIDSYS" | sudo tee -a /etc/auto.sap

    For more information about autofs, see autofs - how it works.

  6. On both servers, start the autofs service:

    ~> sudo systemctl enable autofs
    ~> sudo systemctl restart autofs
    ~> sudo automount -v
  7. Trigger autofs to mount shared directories by accessing each directory by using the cd command. For example:

    ~> cd /sapmnt/SID
    ~> cd /usr/sap/trans
    ~> cd /usr/sap/SID/SYS
    
  8. After you access all the directories, issue the df -Th command to confirm the directories are mounted.

    ~> df -Th | grep file_share_name

    You should see mount points and directories similar to the following:

    10.49.153.26:/nfs_share_nw_ha              nfs      1007G   76M  956G   1% /mnt/nfs
    10.49.153.26:/nfs_share_nw_ha/usrsapAHASYS nfs      1007G   76M  956G   1% /usr/sap/AHA/SYS
    10.49.153.26:/nfs_share_nw_ha/usrsaptrans  nfs      1007G   76M  956G   1% /usr/sap/trans
    10.49.153.26:/nfs_share_nw_ha/sapmntAHA    nfs      1007G   76M  956G   1% /sapmnt/AHA

Configure the Cloud Load Balancing failover support

The Internal TCP/UDP Load Balancing service with failover support routes SCS and ERS traffic to the active instance of each component in the SAP NetWeaver cluster. Internal TCP/UDP Load Balancing uses virtual IP (VIP) addresses, backend services, instance groups, and health checks to route the traffic appropriately.

Reserve IP addresses for the virtual IPs

For an SAP NetWeaver high-availability cluster, you create two VIPs, which are sometimes referred to as floating IP addresses. One VIP follows the active SAP Central Services (SCS) instance and the other follows the Enqueue Replication Server (ERS) instance. The load balancer routes traffic that is sent to each VIP to the VM that is currently hosting the active instance of the SCS or ERS component of the VIP.

  1. Open Cloud Shell:

    Go to Cloud Shell

  2. Reserve an IP address for the virtual IP of the SCS and for the VIP of the ERS. For SCS, the IP address is the IP address that applications use to access SAP NetWeaver. For ERS, the IP address is the IP address that is used for Enqueue Server replication. If you omit the --addresses flag, an IP address in the specified subnet is chosen for you:

    ~ gcloud compute addresses create scs-vip-name \
      --region cluster-region --subnet cluster-subnet \
      --addresses scs-vip-address
    
    ~ gcloud compute addresses create ers-vip-name \
      --region cluster-region --subnet cluster-subnet \
      --addresses ers-vip-address

    For more information about reserving a static IP, see Reserving a static internal IP address.

  3. Confirm IP address reservation:

    ~ gcloud compute addresses describe vip-name \
      --region cluster-region

    You should see output similar to the following example:

    address: 10.1.0.85
    addressType: INTERNAL
    creationTimestamp: '2021-05-12T13:30:29.991-07:00'
    description: ''
    id: '1740813556077659146'
    kind: compute#address
    name: scs-aha-vip-name
    networkTier: PREMIUM
    purpose: GCE_ENDPOINT
    region: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1
    selfLink: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1/addresses/scs-aha-vip-name
    status: RESERVED
    subnetwork: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1/subnetworks/example-sub-network-sap

Define host names for the VIP address in /etc/hosts

Define a host name for each VIP address and then add the IP addresses and host names for both the VMs and the VIPs to the /etc/hosts file on each VM.

The VIP host names are not known outside of the VMs unless you also add them to your DNS service. Adding these entries to the local /etc/hosts file protects your cluster from any disruptions to your DNS service.

Your updates to the /etc/hosts file should look similar to the following example:

#
# IP-Address  Full-Qualified-Hostname  Short-Hostname
#
127.0.0.1       localhost
10.1.0.89       nw-ha-vm-1
10.1.0.88       nw-ha-vm-2
10.1.0.90       vh-scs-aha
10.1.0.91       vh-ers-aha
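
To confirm that the new entries are used, you can resolve one of the VIP host names on each VM. Using the example values above:

~> getent hosts vh-scs-aha
10.1.0.90       vh-scs-aha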

Create the Cloud Load Balancing health checks

Create health checks: one for the active SCS instance and one for the active ERS instance.

  1. In Cloud Shell, create the health checks. To avoid clashing with other services, designate port numbers for the SCS and ERS instances in the private range, 49152-65535. The check-interval and timeout values in the following commands are slightly longer than the defaults so as to increase failover tolerance during Compute Engine live migration events. You can adjust the values, if necessary:

    1. ~ gcloud compute health-checks create tcp scs-health-check-name \
      --port=scs-healthcheck-port-num --proxy-header=NONE --check-interval=10 --timeout=10 \
      --unhealthy-threshold=2 --healthy-threshold=2
    2. ~ gcloud compute health-checks create tcp ers-health-check-name \
      --port=ers-healthcheck-port-num --proxy-header=NONE --check-interval=10 --timeout=10 \
      --unhealthy-threshold=2 --healthy-threshold=2
  2. Confirm the creation of each health check:

    ~ gcloud compute health-checks describe health-check-name

    You should see output similar to the following example:

    checkIntervalSec: 10
    creationTimestamp: '2021-05-12T15:12:21.892-07:00'
    healthyThreshold: 2
    id: '1981070199800065066'
    kind: compute#healthCheck
    name: scs-aha-health-check-name
    selfLink: https://www.googleapis.com/compute/v1/projects/example-project-123456/global/healthChecks/scs-aha-health-check-name
    tcpHealthCheck:
      port: 60000
      portSpecification: USE_FIXED_PORT
      proxyHeader: NONE
    timeoutSec: 10
    type: TCP
    unhealthyThreshold: 2

Create a firewall rule for the health checks

If you haven't done so already, define a firewall rule for a port in the private range that allows access to your host VMs from the IP ranges that are used by Cloud Load Balancing health checks, 35.191.0.0/16 and 130.211.0.0/22. For more information about firewall rules for load balancers, see Creating firewall rules for health checks.

  1. If you don't already have one, add a network tag to your host VMs. This network tag is used by the firewall rule for health checks.

  2. Create a firewall rule that uses the network tag to allow the health checks:

    ~ gcloud compute firewall-rules create  rule-name \
      --network=network-name \
      --action=ALLOW \
      --direction=INGRESS \
      --source-ranges=35.191.0.0/16,130.211.0.0/22 \
      --target-tags=network-tags \
      --rules=tcp:scs-healthcheck-port-num,tcp:ers-healthcheck-port-num

    For example:

    gcloud compute firewall-rules create  nw-ha-cluster-health-checks \
    --network=example-network \
    --action=ALLOW \
    --direction=INGRESS \
    --source-ranges=35.191.0.0/16,130.211.0.0/22 \
    --target-tags=allow-health-check \
    --rules=tcp:60000,tcp:60010

Create Compute Engine instance groups

You need to create an instance group in each zone that contains a cluster-node VM and add the VM in that zone to the instance group.

  1. In Cloud Shell, create the primary instance group and add the primary VM to it:

    1. ~ gcloud compute instance-groups unmanaged create primary-ig-name \
      --zone=primary-vm-zone
    2. ~ gcloud compute instance-groups unmanaged add-instances primary-ig-name \
      --zone=primary-vm-zone \
      --instances=primary-vm-name
  2. In Cloud Shell, create the secondary instance group and add the secondary VM to it:

    1. ~ gcloud compute instance-groups unmanaged create secondary-ig-name \
      --zone=secondary-vm-zone
    2. ~ gcloud compute instance-groups unmanaged add-instances secondary-ig-name \
      --zone=secondary-vm-zone \
      --instances=secondary-vm-name
  3. Confirm the creation of the instance groups:

    ~ gcloud compute instance-groups unmanaged list

    You should see output similar to the following example:

    NAME                              ZONE           NETWORK              NETWORK_PROJECT        MANAGED  INSTANCES
    sap-aha-primary-instance-group    us-central1-b  example-network-sap  example-project-123456  No       1
    sap-aha-secondary-instance-group  us-central1-c  example-network-sap  example-project-123456  No       1
    

Configure the backend services

Create two backend services, one for SCS and one for ERS. Add both instance groups to each backend service, designating the opposite instance group as the failover instance group in each backend service. Finally, create forwarding rules from the VIPs to the backend services.

  1. In Cloud Shell, create the backend service and failover group for SCS:

    1. Create the backend service for SCS:

      ~ gcloud compute backend-services create scs-backend-service-name \
         --load-balancing-scheme internal \
         --health-checks scs-health-check-name \
         --no-connection-drain-on-failover \
         --drop-traffic-if-unhealthy \
         --failover-ratio 1.0 \
         --region cluster-region \
         --global-health-checks
    2. Add the primary instance group to the SCS backend service:

      ~ gcloud compute backend-services add-backend scs-backend-service-name \
        --instance-group primary-ig-name \
        --instance-group-zone primary-vm-zone \
        --region cluster-region
    3. Add the secondary instance group as the failover instance group for the SCS backend service:

      ~ gcloud compute backend-services add-backend scs-backend-service-name \
        --instance-group secondary-ig-name \
        --instance-group-zone secondary-vm-zone \
        --failover \
        --region cluster-region
  2. In Cloud Shell, create the backend service and failover group for ERS:

    1. Create the backend service for ERS:

      ~ gcloud compute backend-services create ers-backend-service-name \
      --load-balancing-scheme internal \
      --health-checks ers-health-check-name \
      --no-connection-drain-on-failover \
      --drop-traffic-if-unhealthy \
      --failover-ratio 1.0 \
      --region cluster-region \
      --global-health-checks
    2. Add the secondary instance group to the ERS backend service:

      ~ gcloud compute backend-services add-backend ers-backend-service-name \
        --instance-group secondary-ig-name \
        --instance-group-zone secondary-vm-zone \
        --region cluster-region
    3. Add the primary instance group as the failover instance group for the ERS backend service:

      ~ gcloud compute backend-services add-backend ers-backend-service-name \
        --instance-group primary-ig-name \
        --instance-group-zone primary-vm-zone \
        --failover \
        --region cluster-region
  3. Optionally, confirm that the backend services contain the instance groups as expected:

    ~ gcloud compute backend-services describe backend-service-name \
     --region=cluster-region

    You should see output similar to the following example for the SCS backend service. For ERS, failover: true would appear on the primary instance group:

    backends:
    - balancingMode: CONNECTION
      group: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-b/instanceGroups/sap-aha-primary-instance-group
    - balancingMode: CONNECTION
      failover: true
      group: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-c/instanceGroups/sap-aha-secondary-instance-group
    connectionDraining:
      drainingTimeoutSec: 0
    creationTimestamp: '2021-05-25T08:30:58.424-07:00'
    description: ''
    failoverPolicy:
      disableConnectionDrainOnFailover: true
      dropTrafficIfUnhealthy: true
      failoverRatio: 1.0
    fingerprint: n44gVc1VVQE=
    healthChecks:
    - https://www.googleapis.com/compute/v1/projects/example-project-123456/global/healthChecks/scs-aha-health-check-name
    id: '4940777952116778717'
    kind: compute#backendService
    loadBalancingScheme: INTERNAL
    name: scs-aha-backend-service-name
    protocol: TCP
    region: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1
    selfLink: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1/backendServices/scs-aha-backend-service-name
    sessionAffinity: NONE
    timeoutSec: 30
  4. In Cloud Shell, create forwarding rules for the SCS and ERS backend services:

    1. Create the forwarding rule from the SCS VIP to the SCS backend service:

      ~ gcloud compute forwarding-rules create scs-forwarding-rule-name \
      --load-balancing-scheme internal \
      --address scs-vip-address \
      --subnet cluster-subnet \
      --region cluster-region \
      --backend-service scs-backend-service-name \
      --ports ALL
    2. Create the forwarding rule from the ERS VIP to the ERS backend service:

      ~ gcloud compute forwarding-rules create ers-forwarding-rule-name \
      --load-balancing-scheme internal \
      --address ers-vip-address \
      --subnet cluster-subnet \
      --region cluster-region \
      --backend-service ers-backend-service-name \
      --ports ALL

Test the load balancer configuration

Even though your backend instance groups won't register as healthy until later, you can test the load balancer configuration by setting up a listener to respond to the health checks. After setting up a listener, if the load balancer is configured correctly, the status of the backend instance groups changes to healthy.

The following sections present different methods that you can use to test the configuration.

Testing the load balancer with the socat utility

You can use the socat utility to temporarily listen on a health check port. You need to install the socat utility anyway, because you use it later when you configure cluster resources.

  1. On both host VMs as root, install the socat utility:

    # zypper install -y socat

  2. On the primary VM, start a socat process to listen for 60 seconds on the SCS health check port:

    # timeout 60s socat - TCP-LISTEN:scs-healthcheck-port-num,fork

  3. In Cloud Shell, after waiting a few seconds for the health check to detect the listener, check the health of your SCS backend instance group:

    ~ gcloud compute backend-services get-health scs-backend-service-name \
      --region cluster-region
  4. Repeat the steps for ERS, replacing the SCS variable values with the ERS values.

    You should see output similar to the following example for SCS:

    backend: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-b/instanceGroups/sap-aha-primary-instance-group
    status:
      healthStatus:
      - forwardingRule: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1/forwardingRules/scs-aha-forwarding-rule
        forwardingRuleIp: 10.1.0.90
        healthState: HEALTHY
        instance: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-b/instances/nw-ha-vm-1
        ipAddress: 10.1.0.89
        port: 80
      kind: compute#backendServiceGroupHealth
    ---
    backend: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-c/instanceGroups/sap-aha-secondary-instance-group
    status:
      healthStatus:
      - forwardingRule: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1/forwardingRules/scs-aha-forwarding-rule
        forwardingRuleIp: 10.1.0.90
        healthState: UNHEALTHY
        instance: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-c/instances/nw-ha-vm-2
        ipAddress: 10.1.0.88
        port: 80
      kind: compute#backendServiceGroupHealth

Testing the load balancer using port 22

If port 22 is open for SSH connections on your host VMs, you can temporarily edit the health check to use port 22, where the SSH daemon is already listening and can respond to the health check.

To temporarily use port 22, follow these steps:

  1. Click your health check in the console:

    Go to Health checks page

  2. Click Edit.

  3. In the Port field, change the port number to 22.

  4. Click Save and wait a minute or two.

  5. In Cloud Shell, after waiting a few seconds for the health check to detect the listener, check the health of your backend instance groups:

    ~ gcloud compute backend-services get-health backend-service-name \
      --region cluster-region

    You should see output similar to the following:

    backend: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-b/instanceGroups/sap-aha-primary-instance-group
    status:
      healthStatus:
      - forwardingRule: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1/forwardingRules/scs-aha-forwarding-rule
        forwardingRuleIp: 10.1.0.85
        healthState: HEALTHY
        instance: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-b/instances/nw-ha-vm-1
        ipAddress: 10.1.0.79
        port: 80
      kind: compute#backendServiceGroupHealth
    ---
    backend: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-c/instanceGroups/sap-aha-secondary-instance-group
    status:
      healthStatus:
      - forwardingRule: https://www.googleapis.com/compute/v1/projects/example-project-123456/regions/us-central1/forwardingRules/scs-aha-forwarding-rule
        forwardingRuleIp: 10.1.0.85
        healthState: HEALTHY
        instance: https://www.googleapis.com/compute/v1/projects/example-project-123456/zones/us-central1-c/instances/nw-ha-vm-2
        ipAddress: 10.1.0.78
        port: 80
      kind: compute#backendServiceGroupHealth
  6. When you are done, change the health check port number back to the original port number.
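
If you prefer the command line, you can make the same temporary change with gcloud. The following sketch switches the SCS health check to port 22 and then back to the port that you chose earlier:

~ gcloud compute health-checks update tcp scs-health-check-name --port=22
~ gcloud compute health-checks update tcp scs-health-check-name --port=scs-healthcheck-port-num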

Set up Pacemaker

The following procedure configures the SUSE implementation of a Pacemaker cluster on Compute Engine VMs for SAP NetWeaver.

For more information about configuring high-availability clusters on SLES, see the SUSE Linux Enterprise High Availability Extension documentation for your version of SLES.

Install the required cluster packages

  1. As root on both the primary and secondary hosts, install the following required cluster packages:

    • The ha_sles pattern:

      # zypper install -t pattern ha_sles
    • The sap-suse-cluster-connector package:

      # zypper install -y sap-suse-cluster-connector
    • If you didn't already install it, the socat utility:

      # zypper install -y socat

  2. Confirm that the latest high-availability agents are loaded:

    # zypper se -t patch SUSE-SLE-HA

Initialize, configure, and start the cluster on the primary VM

You initialize the cluster by using the ha-cluster-init SUSE script. You then need to edit the Corosync configuration file and sync it with the secondary node. After starting the cluster, you then set additional cluster properties and defaults by using crm commands.

Initialize the cluster

To initialize the cluster:

  1. On the primary host as root, initialize the cluster by using the SUSE ha-cluster-init script. The following commands name the cluster, create and configure the corosync.conf configuration file, and set up synchronization between the cluster nodes.

    # ha-cluster-init --name cluster-name --yes --interface eth0 csync2
    # ha-cluster-init --name cluster-name --yes --interface eth0 corosync

Update the Corosync configuration files

  1. Open the corosync.conf file for editing:

    # vi /etc/corosync/corosync.conf
  2. In the totem section of the corosync.conf file, set the parameters in the following excerpted example to the values that are shown. Some parameters might already be set to the correct values:

    totem {
            ...
            token: 20000
            token_retransmits_before_loss_const: 10
            join: 60
            max_messages: 20
            ...
    }
  3. Start the cluster:

    # ha-cluster-init --name cluster-name cluster

Set the additional cluster properties

  1. Set the general cluster properties:

    # crm configure property stonith-timeout="300s"
    # crm configure property stonith-enabled="true"
    # crm configure rsc_defaults resource-stickiness="1"
    # crm configure rsc_defaults migration-threshold="3"
    # crm configure op_defaults timeout="600"

    When you define the individual cluster resources, the values that you set for resource-stickiness and migration-threshold override the default values that you set here.

    You can see the resource defaults, as well as the values for any defined resources, by entering crm config show.

  2. Start Pacemaker on the primary host:

    # systemctl enable pacemaker
    # systemctl start pacemaker
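
    Optionally, you can confirm that the Pacemaker service is active on the primary host before you join the secondary node:

    # systemctl is-active pacemaker

    The command prints active while the service is running.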

Join the secondary VM to the cluster

From the open terminal on the primary VM, join and start the cluster on the secondary VM via SSH.

  1. From the primary VM, run the following ha-cluster-join script options on the secondary VM via SSH. If you have configured your HA cluster as described by these instructions, you can disregard the warnings about the watchdog device.

    1. Run the --interface eth0 csync2 option:

      # ssh secondary-vm-name 'ha-cluster-join --cluster-node primary-vm-name --yes --interface eth0 csync2'
    2. Run the ssh_merge option:

      # ssh secondary-vm-name 'ha-cluster-join --cluster-node primary-vm-name --yes ssh_merge'
    3. Run the cluster option:

      # ssh secondary-vm-name 'ha-cluster-join --cluster-node primary-vm-name --yes cluster'
  2. Start Pacemaker on the secondary host:

    1. Enable Pacemaker:

      # ssh secondary-vm-name systemctl enable pacemaker
    2. Start Pacemaker:

      # ssh secondary-vm-name systemctl start pacemaker
  3. On either host as root, confirm that the cluster shows both nodes:

    # crm_mon -s

    You should see output similar to the following:

    CLUSTER OK: 2 nodes online, 0 resource instances configured

Configure the cluster resources for the infrastructure

You define the resources that Pacemaker manages in a high-availability cluster. You need to define resources for the following cluster components:

  • The fencing device, which prevents split brain scenarios
  • The SCS and ERS directories in the shared file system
  • The health checks
  • The VIPs
  • The SCS and ERS components

You define the resources for the SCS and ERS components after the other resources because you need to install SAP NetWeaver first.

Enable maintenance mode

  1. On either host as root, put the cluster in maintenance mode:

    # crm configure property maintenance-mode="true"
  2. Confirm maintenance mode:

    # crm status

    The output should indicate that resource management is disabled, as shown in the following example:

    Cluster Summary:
    * Stack: corosync
    * Current DC: nw-ha-vm-1 (version 2.0.4+20200616.2deceaa3a-3.3.1-2.0.4+20200616.2deceaa3a) - partition with quorum
    * Last updated: Fri May 14 15:26:08 2021
    * Last change:  Thu May 13 19:02:33 2021 by root via cibadmin on nw-ha-vm-1
    * 2 nodes configured
    * 0 resource instances configured
    
                *** Resource management is DISABLED ***
    The cluster will not attempt to start, stop or recover services
    
    Node List:
    * Online: [ nw-ha-vm-1 nw-ha-vm-2 ]
    
    Full List of Resources:
    * No resources

Set up fencing

You set up fencing by defining a cluster resource with the fence_gce agent for each host VM.

To ensure the correct sequence of events after a fencing action, you also configure the operating system to delay the restart of Corosync after a VM is fenced. You also adjust the Pacemaker timeout for reboots to account for the delay.

Create the fencing device resources

For each VM in the cluster, you create a cluster resource for the fencing device that can restart that VM. The fencing device for a VM must run on a different VM, so you configure the location of the cluster resource to run on any VM except the VM it can restart.

  1. On the primary host as root, create a cluster resource for a fencing device for the primary VM:

    # crm configure primitive fencing-rsc-name-primary-vm stonith:fence_gce \
      op monitor interval="300s" timeout="120s" \
      op start interval="0" timeout="60s" \
      params port="primary-vm-name" zone="primary-vm-zone" \
      project="cluster-project-id" \
      pcmk_reboot_timeout=300 pcmk_monitor_retries=4 pcmk_delay_max=30
  2. Configure the location of the fencing device for the primary VM so that it is active on only the secondary VM:

    # crm configure location fencing-location-name-primary-vm \
      fencing-rsc-name-primary-vm -inf: "primary-vm-name"
  3. On the primary host as root, create a cluster resource for a fencing device for the secondary VM:

    # crm configure primitive fencing-rsc-name-secondary-vm stonith:fence_gce \
      op monitor interval="300s" timeout="120s" \
      op start interval="0" timeout="60s" \
      params port="secondary-vm-name" zone="secondary-vm-zone" \
      project="cluster-project-id" \
      pcmk_reboot_timeout=300 pcmk_monitor_retries=4
  4. Configure the location of the fencing device for the secondary VM so that it is active on only the primary VM:

    # crm configure location fencing-location-name-secondary-vm \
      fencing-rsc-name-secondary-vm -inf: "secondary-vm-name"
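
Optionally, before you continue, you can display the fencing definitions to confirm the parameters and the location constraints. This is a minimal check; substitute the resource and constraint names that you used:

    # crm configure show fencing-rsc-name-primary-vm fencing-rsc-name-secondary-vm
    # crm configure show fencing-location-name-primary-vm fencing-location-name-secondary-vm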

Set a delay for the restart of Corosync

  1. On both hosts as root, create a systemd drop-in file that delays the startup of Corosync to ensure the proper sequence of events after a fenced VM is rebooted:

    # systemctl edit corosync.service
  2. Add the following lines to the file:

    [Service]
    ExecStartPre=/bin/sleep 60
  3. Save the file and exit the editor.

  4. Reload the systemd manager configuration.

    # systemctl daemon-reload
  5. Confirm the drop-in file was created:

    # service corosync status

    You should see a line for the drop-in file, as shown in the following example:

    ● corosync.service - Corosync Cluster Engine
       Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
      Drop-In: /etc/systemd/system/corosync.service.d
               └─override.conf
       Active: active (running) since Tue 2021-07-20 23:45:52 UTC; 2 days ago
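
Optionally, on both hosts, you can also display the effective unit configuration, including the drop-in file and the ExecStartPre delay, by using the systemctl cat command:

    # systemctl cat corosync.service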

Create the file system resources

Now that you have created the shared file system directories, you can define the cluster resources.

  1. Configure the file system resources for the instance-specific directories:

    # crm configure primitive scs-file-system-rsc-name Filesystem \
    device="nfs-path/usrsapSIDASCSscs-instance-number" \
    directory="/usr/sap/SID/ASCSscs-instance-number" fstype="nfs" \
    op start timeout=60s interval=0 \
    op stop timeout=60s interval=0 \
    op monitor interval=20s timeout=40s
    # crm configure primitive ers-file-system-rsc-name Filesystem \
    device="nfs-path/usrsapSIDERSers-instance-number" \
    directory="/usr/sap/SID/ERSers-instance-number" fstype="nfs" \
    op start timeout=60s interval=0 \
    op stop timeout=60s interval=0 \
    op monitor interval=20s timeout=40s
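
For reference, the following sketch shows the SCS command with the placeholders filled in, using the example resource name filesystem-rsc-nw-aha-scs, SID AHA, and SCS instance number 00 that appear in the sample output later in this guide. The NFS server path 10.1.0.4:/export is a hypothetical value; use the path of your own shared file system:

    # crm configure primitive filesystem-rsc-nw-aha-scs Filesystem \
      device="10.1.0.4:/export/usrsapAHAASCS00" \
      directory="/usr/sap/AHA/ASCS00" fstype="nfs" \
      op start timeout=60s interval=0 \
      op stop timeout=60s interval=0 \
      op monitor interval=20s timeout=40s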

Create the health check resources

  1. Configure the cluster resources for the SCS and ERS health checks:

    # crm configure primitive scs-health-check-rsc-name anything \
      params binfile="/usr/bin/socat" \
      cmdline_options="-U TCP-LISTEN:scs-healthcheck-port-num,backlog=10,fork,reuseaddr /dev/null" \
      op monitor timeout=20s interval=10s \
      op_params depth=0
    # crm configure primitive ers-health-check-rsc-name anything \
      params binfile="/usr/bin/socat" \
      cmdline_options="-U TCP-LISTEN:ers-healthcheck-port-num,backlog=10,fork,reuseaddr /dev/null" \
      op monitor timeout=20s interval=10s \
      op_params depth=0
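
Later, after the cluster is taken out of maintenance mode and these resources are started, you can confirm that the socat listener is running on the active node. This is a minimal sketch; the listener port in the output is the health check port number that you configured for your load balancer:

    # ss -tlnp | grep socat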

Create the VIP resources

Define the cluster resources for the VIP addresses.

  1. If you need to look up the numerical VIP addresses, you can use the following commands:

    • gcloud compute addresses describe scs-vip-name \
      --region=cluster-region --format="value(address)"
    • gcloud compute addresses describe ers-vip-name \
      --region=cluster-region --format="value(address)"
  2. Create the cluster resources for the SCS and ERS VIPs.

    # crm configure primitive scs-vip-rsc-name IPaddr2 \
     params ip=scs-vip-address cidr_netmask=32 nic="eth0" \
     op monitor interval=3600s timeout=60s
    # crm configure primitive ers-vip-rsc-name IPaddr2 \
     params ip=ers-vip-address cidr_netmask=32 nic="eth0" \
     op monitor interval=3600s timeout=60s

View the defined resources

  1. To see all of the resources that you have defined so far, enter the following command:

    # crm status

    You should see output similar to the following example:

    Stack: corosync
    Current DC: nw-ha-vm-1 (version 1.1.24+20201209.8f22be2ae-3.12.1-1.1.24+20201209.8f22be2ae) - partition with quorum
    Last updated: Wed May 26 19:10:10 2021
    Last change: Tue May 25 23:48:35 2021 by root via cibadmin on nw-ha-vm-1
    
    2 nodes configured
    8 resource instances configured
    
                  *** Resource management is DISABLED ***
      The cluster will not attempt to start, stop or recover services
    
    Online: [ nw-ha-vm-1 nw-ha-vm-2 ]
    
    Full list of resources:
    
     fencing-rsc-nw-aha-vm-1        (stonith:fence_gce):    Stopped (unmanaged)
     fencing-rsc-nw-aha-vm-2        (stonith:fence_gce):    Stopped (unmanaged)
     filesystem-rsc-nw-aha-scs      (ocf::heartbeat:Filesystem):    Stopped (unmanaged)
     filesystem-rsc-nw-aha-ers      (ocf::heartbeat:Filesystem):    Stopped (unmanaged)
     health-check-rsc-nw-ha-scs     (ocf::heartbeat:anything):      Stopped (unmanaged)
     health-check-rsc-nw-ha-ers     (ocf::heartbeat:anything):      Stopped (unmanaged)
     vip-rsc-nw-aha-scs     (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)
     vip-rsc-nw-aha-ers     (ocf::heartbeat:IPaddr2):       Stopped (unmanaged)

Install SCS and ERS

The following section covers only the requirements and recommendations that are specific to installing SAP NetWeaver on Google Cloud.

For complete installation instructions, see the SAP NetWeaver documentation.

Prepare for installation

To ensure consistency across the cluster and simplify installation, before you install the SAP NetWeaver SCS and ERS components, define the users, groups, and permissions and put the secondary server in standby mode.

  1. Take the cluster out of maintenance mode:

    # crm configure property maintenance-mode="false"
  2. On both servers as root, enter the following commands, specifying the user and group IDs that are appropriate for your environment:

    # groupadd -g gid-sapinst sapinst
    # groupadd -g gid-sapsys sapsys
    # useradd -u uid-sidadm sid-loweradm -g sapsys
    # usermod -a -G sapinst sid-loweradm
    # useradd -u uid-sapadm sapadm -g sapinst
    
    # chown sid-loweradm:sapsys /usr/sap/SID/SYS
    # chown sid-loweradm:sapsys /sapmnt/SID -R
    # chown sid-loweradm:sapsys /usr/sap/trans -R
    # chown sid-loweradm:sapsys /usr/sap/SID/SYS -R
    # chown sid-loweradm:sapsys /usr/sap/SID -R
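
For example, with the SID AHA that is used elsewhere in this guide, the user and group commands might look like the following sketch. The numeric IDs are hypothetical; they must be identical on both servers and must not conflict with IDs that already exist in your landscape. The chown commands follow the same substitution pattern:

    # groupadd -g 1001 sapinst
    # groupadd -g 79 sapsys
    # useradd -u 2001 ahaadm -g sapsys
    # usermod -a -G sapinst ahaadm
    # useradd -u 2002 sapadm -g sapinst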

Install the SCS component

  1. On the secondary server, enter the following command to put the secondary server in standby mode:

    # crm_standby -v on -N ${HOSTNAME};

    Putting the secondary server in standby mode consolidates all of the cluster resources on the primary server, which simplifies installation.

  2. Confirm that the secondary server is in standby mode:

    # crm status

    You should see output similar to the following example:

    Stack: corosync
    Current DC: nw-ha-vm-1 (version 1.1.24+20201209.8f22be2ae-3.12.1-1.1.24+20201209.8f22be2ae) - partition with quorum
    Last updated: Thu May 27 17:45:16 2021
    Last change: Thu May 27 17:45:09 2021 by root via crm_attribute on nw-ha-vm-2
    
    2 nodes configured
    8 resource instances configured
    
    Node nw-ha-vm-2: standby
    Online: [ nw-ha-vm-1 ]
    
    Full list of resources:
    
     fencing-rsc-nw-aha-vm-1        (stonith:fence_gce):    Stopped
     fencing-rsc-nw-aha-vm-2        (stonith:fence_gce):    Started nw-ha-vm-1
     filesystem-rsc-nw-aha-scs      (ocf::heartbeat:Filesystem):    Started nw-ha-vm-1
     filesystem-rsc-nw-aha-ers      (ocf::heartbeat:Filesystem):    Started nw-ha-vm-1
     health-check-rsc-nw-ha-scs     (ocf::heartbeat:anything):      Started nw-ha-vm-1
     health-check-rsc-nw-ha-ers     (ocf::heartbeat:anything):      Started nw-ha-vm-1
     vip-rsc-nw-aha-scs     (ocf::heartbeat:IPaddr2):       Started nw-ha-vm-1
     vip-rsc-nw-aha-ers     (ocf::heartbeat:IPaddr2):       Started nw-ha-vm-1
  3. On the primary server as the root user, change your directory to a temporary installation directory, such as /tmp, to install the SCS instance by running the SAP Software Provisioning Manager (SWPM).

    • To access the web interface of SWPM, you need the password for the root user. If your IT policy does not allow the SAP administrator to have access to the root password, you can use the SAPINST_REMOTE_ACCESS_USER parameter instead (see the example after this procedure).

    • When you start SWPM, use the SAPINST_USE_HOSTNAME parameter to specify the virtual host name that you defined for the SCS VIP address in the /etc/hosts file.

      For example:

      cd /tmp; /mnt/nfs/install/SWPM/sapinst SAPINST_USE_HOSTNAME=vh-aha-scs
    • On the final SWPM confirmation page, ensure that the virtual host name is correct.

  4. After the installation completes, on the secondary server, take the secondary VM out of standby mode:

    # crm_standby -v off -N ${HOSTNAME}
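
If your IT policy does not allow use of the root password for the SWPM web interface, the following sketch shows how you might start SWPM with the SAPINST_REMOTE_ACCESS_USER parameter, as noted in step 3. The user name swpmuser is a hypothetical placeholder for an existing operating system user on the host:

    cd /tmp; /mnt/nfs/install/SWPM/sapinst SAPINST_USE_HOSTNAME=vh-aha-scs SAPINST_REMOTE_ACCESS_USER=swpmuser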

Install the ERS component

  1. On the primary server as root or sidadm, stop the SCS service.

    # su - sid-loweradm -c "sapcontrol -nr scs-instance-number -function Stop"
    # su - sid-loweradm -c "sapcontrol -nr scs-instance-number -function StopService"
  2. On the primary server, enter the following command to put the primary server in standby mode:

    # crm_standby -v on -N ${HOSTNAME};

    Putting the primary server in standby mode consolidates all of the cluster resources on the secondary server, which simplifies installation.

  3. Confirm that the primary server is in standby mode:

    # crm status
  4. On the secondary server as the root user, change your directory to a temporary installation directory, such as /tmp, to install the ERS instance by running the SAP Software Provisioning Manager (SWPM).

    • Use the same user and password to access SWPM that you used when you installed the SCS component.

    • When you start SWPM, use the SAPINST_USE_HOSTNAME parameter to specify the virtual host name that you defined for the ERS VIP address in the /etc/hosts file.

      For example:

      cd /tmp; /mnt/nfs/install/SWPM/sapinst SAPINST_USE_HOSTNAME=vh-aha-ers
    • On the final SWPM confirmation page, ensure that the virtual host name is correct.

  5. On the primary server, take the primary VM out of standby mode so that both VMs are active:

    # crm_standby -v off -N ${HOSTNAME};

Configure the SAP services

You need to confirm that the services are configured correctly, check the settings in the ASCS and ERS profiles, and add the sidadm user to the haclient user group.

Confirm the SAP service entries

  1. On both servers, confirm that /usr/sap/sapservices contains entries for both the SCS and ERS services. You can add any missing entries by using the sapstartsrv command with options pf=profile-of-the-sap-instance and -reg. For example:

    # LD_LIBRARY_PATH=/usr/sap/hostctrl/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH
    /usr/sap/hostctrl/exe/sapstartsrv \
     pf=/usr/sap/SID/SYS/profile/SID_ERSers-instance-number_ers-virtual-host-name \
     -reg
    /usr/sap/hostctrl/exe/sapstartsrv \
     pf=/usr/sap/SID/SYS/profile/SID_ASCSscs-instance-number_scs-virtual-host-name \
     -reg
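
With the example SID AHA, virtual host names vh-scs-aha and vh-ers-aha, and instance numbers 00 and 10, the /usr/sap/sapservices entries would look similar to the following sketch. The exact format can vary by SAP kernel release, so treat this only as an illustration:

    LD_LIBRARY_PATH=/usr/sap/AHA/ASCS00/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/AHA/ASCS00/exe/sapstartsrv pf=/usr/sap/AHA/SYS/profile/AHA_ASCS00_vh-scs-aha -D -u ahaadm
    LD_LIBRARY_PATH=/usr/sap/AHA/ERS10/exe:$LD_LIBRARY_PATH; export LD_LIBRARY_PATH; /usr/sap/AHA/ERS10/exe/sapstartsrv pf=/usr/sap/AHA/SYS/profile/AHA_ERS10_vh-ers-aha -D -u ahaadm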

Stop the SAP services

  1. On the secondary server, stop the ERS service:

    # su - sid-loweradm -c "sapcontrol -nr ers-instance-number -function Stop"
    # su - sid-loweradm -c "sapcontrol -nr ers-instance-number -function StopService"
  2. On each server, validate that all services are stopped:

    # su - sid-loweradm -c "sapcontrol -nr scs-instance-number -function GetSystemInstanceList"
    # su - sid-loweradm -c "sapcontrol -nr ers-instance-number -function GetSystemInstanceList"

    You should see output similar to the following example:

    18.05.2021 17:39:18
    GetSystemInstanceList
    FAIL: NIECONN_REFUSED (Connection refused), NiRawConnect failed in plugin_fopen()

Edit the SCS and ERS profiles

  1. On either server, switch to the profile directory by using either of the following commands:

    # cd /usr/sap/SID/SYS/profile
    # cd /sapmnt/SID/profile
  2. If necessary, you can find the file names of your ASCS and ERS profiles by listing the files in the profile directory or by using the following formats:

    SID_ASCSscs-instance-number_scs-virtual-host-name
    SID_ERSers-instance-number_ers-virtual-host-name
  3. Enable the sap-suse-cluster-connector package by adding the following lines to the ASCS and ERS instance profiles:

    #-----------------------------------------------------------------------
    # SUSE HA library
    #-----------------------------------------------------------------------
    service/halib = $(DIR_CT_RUN)/saphascriptco.so
    service/halib_cluster_connector = /usr/bin/sap_suse_cluster_connector
  4. If you are using ENSA1, enable the keepalive function by setting the following in the ASCS profile:

    enque/encni/set_so_keepalive = true

    For more information, see SAP Note 1410736 - TCP/IP: setting keepalive interval.

  5. If necessary, edit the ASCS and ERS profiles to change the startup behavior of the Enqueue Server and the Enqueue Replication Server.

    ENSA1

    In the "Start SAP enqueue server" section of the ASCS profile, if you see Restart_Program_nn, change "Restart" to "Start", as shown in the following example.

    Start_Program_01 = local $(_EN) pf=$(_PF)

    In the "Start enqueue replication server" section of the ERS profile, if you see Restart_Program_nn, change "Restart" to "Start", as shown in the following example.

    Start_Program_00 = local $(_ER) pf=$(_PFL) NR=$(SCSID)

    ENSA2

    In the "Start SAP enqueue server" section of the ASCS profile, if you see Restart_Program_nn, change "Restart" to "Start", as shown in the following example.

    Start_Program_01 = local $(_ENQ) pf=$(_PF)

    In the "Start enqueue replicator" section of the ERS profile, if you see Restart_Program_nn, change "Restart" to "Start", as shown in the following example.

    Start_Program_00 = local $(_ENQR) pf=$(_PF) ...

Add the sidadm user to the haclient user group

When you installed the sap-suse-cluster-connector, the installation created an haclient user group. To enable the sidadm user to work with the cluster, add it to the haclient user group.

  1. On both servers, add the sidadm user to the haclient user group:

    # usermod -aG haclient sid-loweradm
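
To verify that the change took effect, you can display the group memberships of the sidadm user on both servers. The output should include haclient in the list of groups:

    # id sid-loweradm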

Configure the cluster resources for SCS and ERS

  1. As root from either server, place the cluster in maintenance mode:

    # crm configure property maintenance-mode="true"
  2. Confirm that the cluster is in maintenance mode:

    # crm status

    If the cluster is in maintenance mode, the status includes the following lines:

                  *** Resource management is DISABLED ***
    The cluster will not attempt to start, stop or recover services
  3. Create the cluster resources for the SCS and ERS services:

    ENSA1

    • Create the cluster resource for the SCS instance. The value of InstanceName is the name of the instance profile that SWPM generated when you installed SCS.

      # crm configure primitive scs-instance-rsc-name SAPInstance \
        operations \$id=scs-instance-rsc-operations-name \
        op monitor interval=11 timeout=60 on-fail=restart \
        params InstanceName=SID_ASCSscs-instance-number_scs-virtual-host-name \
           START_PROFILE="/path-to-profile/SID_ASCSscs-instance-number_scs-virtual-host-name" \
           AUTOMATIC_RECOVER=false \
        meta resource-stickiness=5000 failure-timeout=60 \
           migration-threshold=1 priority=10
    • Create the cluster resource for the ERS instance. The value of InstanceName is the name of the instance profile that SWPM generated when you installed ERS. The parameter IS_ERS=true tells Pacemaker to set the runs_ers_SID flag to 1 on the node where ERS is active.

      # crm configure primitive ers-instance-rsc-name SAPInstance \
        operations \$id=ers-instance-rsc-operations-name \
        op monitor interval=11 timeout=60 on-fail=restart \
        params InstanceName=SID_ERSers-instance-number_ers-virtual-host-name  \
           START_PROFILE="/path-to-profile/SID_ERSers-instance-number_ers-virtual-host-name" \
           AUTOMATIC_RECOVER=false IS_ERS=true \
        meta priority=1000

    ENSA2

    • Create the cluster resource for the SCS instance. The value of InstanceName is the name of the instance profile that SWPM generated when you installed SCS.

      # crm configure primitive scs-instance-rsc-name SAPInstance \
        operations \$id=scs-instance-rsc-operations-name \
        op monitor interval=11 timeout=60 on-fail=restart \
        params InstanceName=SID_ASCSscs-instance-number_scs-virtual-host-name \
           START_PROFILE="/path-to-profile/SID_ASCSscs-instance-number_scs-virtual-host-name" \
           AUTOMATIC_RECOVER=false \
        meta resource-stickiness=5000 failure-timeout=60
    • Create the cluster resource for the ERS instance. The value of InstanceName is the name of the instance profile that SWPM generated when you installed ERS.

      # crm configure primitive ers-instance-rsc-name SAPInstance \
        operations \$id=ers-instance-rsc-operations-name \
        op monitor interval=11 timeout=60 on-fail=restart \
        params InstanceName=SID_ERSers-instance-number_ers-virtual-host-name  \
           START_PROFILE="/path-to-profile/SID_ERSers-instance-number_ers-virtual-host-name" \
           AUTOMATIC_RECOVER=false IS_ERS=true

Configure the resource groups and location constraints

  1. Group the SCS and ERS resources. You can display the names of all your previously defined resources by entering the command crm resource status:

    # crm configure group scs-rsc-group-name scs-file-system-rsc-name \
      scs-health-check-rsc-name scs-vip-rsc-name \
      scs-instance-rsc-name \
      meta resource-stickiness=3000
    # crm configure group ers-rsc-group-name ers-file-system-rsc-name \
      ers-health-check-rsc-name ers-vip-rsc-name \
      ers-instance-rsc-name
  2. Create the colocation constraints:

    ENSA1

    1. Create a colocation constraint that prevents the SCS resources from running on the same server as the ERS resources:

      # crm configure colocation prevent-scs-ers-coloc -5000: ers-rsc-group-name scs-rsc-group-name
    2. Configure SCS to fail over to the server where ERS is running, as determined by the runs_ers_SID flag being equal to 1:

      # crm configure location loc-scs-SID-failover-to-ers scs-instance-rsc-name \
      rule 2000: runs_ers_SID eq 1
    3. Configure SCS to start before ERS moves to the other server after a failover:

      # crm configure order ord-sap-SID-first-start-ascs \
       Optional: scs-instance-rsc-name:start \
       ers-instance-rsc-name:stop symmetrical=false

    ENSA2

    1. Create a colocation constraint that prevents the SCS resources from running on the same server as the ERS resources:

      # crm configure colocation prevent-scs-ers-coloc -5000: ers-rsc-group-name scs-rsc-group-name
    2. Configure SCS to start before ERS moves to the other server after a failover:

      # crm configure order ord-sap-SID-first-start-ascs \
       Optional: scs-instance-rsc-name:start \
       ers-instance-rsc-name:stop symmetrical=false
  3. Disable maintenance mode.

    # crm configure property maintenance-mode="false"
  4. Check the configuration of the groups, colocation constraints, and ordering:

    # crm config show

    The output should include lines similar to those in the following example:

    ENSA1

    group ers-aha-rsc-group-name filesystem-rsc-nw-aha-ers health-check-rsc-nw-ha-ers vip-rsc-nw-aha-ers ers-aha-instance-rsc-name
    group scs-aha-rsc-group-name filesystem-rsc-nw-aha-scs health-check-rsc-nw-ha-scs vip-rsc-nw-aha-scs scs-aha-instance-rsc-name \
            meta resource-stickiness=3000
    colocation prevent-aha-scs-ers-coloc -5000: ers-aha-rsc-group-name scs-aha-rsc-group-name
    location fencing-rsc-nw-aha-vm-1-loc fencing-rsc-nw-aha-vm-1 -inf: nw-ha-vm-1
    location fencing-rsc-nw-aha-vm-2-loc fencing-rsc-nw-aha-vm-2 -inf: nw-ha-vm-2
    location loc-sap-AHA-failover-to-ers scs-aha-instance-rsc-name \
            rule 2000: runs_ers_AHA eq 1

    ENSA2

    group ers-aha-rsc-group-name filesystem-rsc-nw-aha-ers health-check-rsc-nw-ha-ers vip-rsc-nw-aha-ers ers-aha-instance-rsc-name
    group scs-aha-rsc-group-name filesystem-rsc-nw-aha-scs health-check-rsc-nw-ha-scs vip-rsc-nw-aha-scs scs-aha-instance-rsc-name \
            meta resource-stickiness=3000
    location fencing-location-nw-aha-vm-1 fencing-rsc-nw-aha-vm-1 -inf: nw-ha-vm-1
    location fencing-location-nw-aha-vm-2 fencing-rsc-nw-aha-vm-2 -inf: nw-ha-vm-2
    order ord-sap-AHA-first-start-ascs Optional: scs-aha-instance-rsc-name:start ers-aha-instance-rsc-name:stop symmetrical=false
    colocation prevent-aha-scs-ers-coloc -5000: ers-aha-rsc-group-name scs-aha-rsc-group-name

Test your cluster

This section shows you how to run the following tests:

  • Check for configuration errors
  • Confirm that the SCS and ERS resources switch servers correctly during failovers
  • Confirm that locks are retained
  • Confirm that Compute Engine maintenance events don't trigger a failover

Check the cluster configuration from SAP

  1. As root on either server, see which instances are active on the server:

    # crm status
  2. Switch to the sidadm user:

    # su - sid-loweradm
  3. Check the cluster configuration. For the instance number, specify the instance number of the SCS or ERS instance that is active on the server where you enter the command:

    > sapcontrol -nr instance-number -function HAGetFailoverConfig

    HAActive should be TRUE, as shown in the following example:

    20.05.2021 01:33:25
    HAGetFailoverConfig
    OK
    HAActive: TRUE
    HAProductVersion: SUSE Linux Enterprise Server for SAP Applications 15 SP2
    HASAPInterfaceVersion: SUSE Linux Enterprise Server for SAP Applications 15 SP2 (sap_suse_cluster_connector 3.1.2)
    HADocumentation: https://www.suse.com/products/sles-for-sap/resource-library/sap-best-practices/
    HAActiveNode: nw-ha-vm-1
    HANodes: nw-ha-vm-1, nw-ha-vm-2
  4. As sidadm, check for errors in the configuration:

    > sapcontrol -nr instance-number -function HACheckConfig

    You should see output similar to the following example:

    20.05.2021 01:37:19
    HACheckConfig
    OK
    state, category, description, comment
    SUCCESS, SAP CONFIGURATION, Redundant ABAP instance configuration, 0 ABAP instances detected
    SUCCESS, SAP CONFIGURATION, Redundant Java instance configuration, 0 Java instances detected
    SUCCESS, SAP CONFIGURATION, Enqueue separation, All Enqueue server separated from application server
    SUCCESS, SAP CONFIGURATION, MessageServer separation, All MessageServer separated from application server
    SUCCESS, SAP STATE, SCS instance running, SCS instance status ok
    SUCCESS, SAP CONFIGURATION, SAPInstance RA sufficient version (vh-scs-aha_AHA_00), SAPInstance includes is-ers patch
    SUCCESS, SAP CONFIGURATION, Enqueue replication (vh-scs-aha_AHA_00), Enqueue replication enabled
    SUCCESS, SAP STATE, Enqueue replication state (vh-scs-aha_AHA_00), Enqueue replication active
  5. As root on either server, check which nodes your resources are running on:

    # crm status

    In the following example, the SCS resources are running on the nw-ha-vm-1 server and the ERS resources are running on the nw-ha-vm-2 server.

     Cluster Summary:
      * Stack: corosync
      * Current DC: nw-ha-vm-2 (version 2.0.4+20200616.2deceaa3a-3.3.1-2.0.4+20200616.2deceaa3a) - partition with quorum
      * Last updated: Thu May 20 16:58:46 2021
      * Last change:  Thu May 20 16:57:31 2021 by ahaadm via crm_resource on nw-ha-vm-2
      * 2 nodes configured
      * 10 resource instances configured
    
    Node List:
      * Online: [ nw-ha-vm-1 nw-ha-vm-2 ]
    
    Active Resources:
      * fencing-rsc-nw-aha-vm-1     (stonith:fence_gce):     Started nw-ha-vm-2
      * fencing-rsc-nw-aha-vm-2     (stonith:fence_gce):     Started nw-ha-vm-1
      * Resource Group: scs-aha-rsc-group-name:
        * filesystem-rsc-nw-aha-scs (ocf::heartbeat:Filesystem):     Started nw-ha-vm-1
        * health-check-rsc-nw-ha-scs        (ocf::heartbeat:anything):       Started nw-ha-vm-1
        * vip-rsc-nw-aha-scs        (ocf::heartbeat:IPaddr2):        Started nw-ha-vm-1
        * scs-aha-instance-rsc-name (ocf::heartbeat:SAPInstance):    Started nw-ha-vm-1
      * Resource Group: ers-aha-rsc-group-name:
        * filesystem-rsc-nw-aha-ers (ocf::heartbeat:Filesystem):     Started nw-ha-vm-2
        * health-check-rsc-nw-ha-ers        (ocf::heartbeat:anything):       Started nw-ha-vm-2
        * vip-rsc-nw-aha-ers        (ocf::heartbeat:IPaddr2):        Started nw-ha-vm-2
        * ers-aha-instance-rsc-name (ocf::heartbeat:SAPInstance):    Started nw-ha-vm-2
  6. As sidadm on the server where SCS is active, simulate a failover:

    > sapcontrol -nr scs-instance-number -function HAFailoverToNode ""
  7. As root, if you follow the failover by using crm_mon, you should see SCS move to the other server, ERS stop on that server, and then ERS move to the server that SCS used to be running on.

Confirm lock entries are retained

To confirm that lock entries are preserved across a failover, first select the tab for your version of the Enqueue Server, and then follow the procedure to generate lock entries, simulate a failover, and confirm that the lock entries are retained after SCS is activated again.

ENSA1

  1. As sidadm, on the server where ERS is active, generate lock entries by using the enqt program:

    > enqt pf=/path-to-profile/SID_ERSers-instance-number_ers-virtual-host-name 11 number-of-locks
  2. As sidadm, on the server where SCS is active, verify that the lock entries are registered:

    > sapcontrol -nr scs-instance-number -function EnqGetStatistic | grep locks_now

    If you created 10 locks, you should see output similar to the following example:

    locks_now: 10
  3. As sidadm, on the server where ERS is active, start the monitoring function, OpCode=20, of the enqt program:

    > enqt pf=/path-to-profile/SID_ERSers-instance-number_ers-virtual-host-name 20 1 1 9999

    For example:

    > enqt pf=/sapmnt/AHA/profile/AHA_ERS10_vh-ers-aha 20 1 1 9999
  4. Where SCS is active, reboot the server.

    On the monitoring server, by the time Pacemaker stops ERS to move it to the other server, you should see output similar to the following.

    Number of selected entries: 10
    Number of selected entries: 10
    Number of selected entries: 10
    Number of selected entries: 10
    Number of selected entries: 10
  5. When the enqt monitor stops, exit the monitor by entering Ctrl + c.

  6. Optionally, as root on either server, monitor the cluster failover:

    # crm_mon
  7. As sidadm, after you confirm the locks were retained, release the locks:

    > enqt pf=/path-to-profile/SID_ERSers-instance-number_ers-virtual-host-name 12 number-of-locks
  8. As sidadm, on the server where SCS is active, verify that the lock entries are removed:

    > sapcontrol -nr scs-instance-number -function EnqGetStatistic | grep locks_now

ENSA2

  1. As sidadm, on the server where ERS is active, generate lock entries by using the enq_admin program:

    > enq_admin --set_locks=number-of-locks:X:DIAG::TAB:%u pf=/path-to-profile/SID_ERSers-instance-number_ers-virtual-host-name
  2. As sidadm, on the server where SCS is active, verify that the lock entries are registered:

    > sapcontrol -nr scs-instance-number -function EnqGetStatistic | grep locks_now

    If you created 10 locks, you should see output similar to the following example:

    locks_now: 10
  3. Where SCS is active, reboot the server.

  4. Optionally, as root on either server, monitor the cluster failover:

    # crm_mon
  5. As sidadm, on the server where SCS was restarted, verify that the lock entries were retained:

    > sapcontrol -nr scs-instance-number -function EnqGetStatistic | grep locks_now
  6. As sidadm on the server where ERS is active, after you confirm the locks were retained, release the locks:

    > enq_admin --release_locks=number-of-locks:X:DIAG::TAB:%u pf=/path-to-profile/SID_ERSers-instance-number_ers-virtual-host-name
  7. As sidadm, on the server where SCS is active, verify that the lock entries are removed:

    > sapcontrol -nr scs-instance-number -function EnqGetStatistic | grep locks_now

    You should see output similar to the following example:

    locks_now: 0

Simulate a Compute Engine maintenance event

Simulate a Compute Engine maintenance event to make sure that live migration does not trigger a failover.

The timeout and interval values that are used in these instructions account for the duration of live migrations. If you use shorter values, the risk that live migration will trigger a failover is greater.

To test the tolerance of your cluster for live migration:

  1. On the primary node, trigger a simulated maintenance event by using the following Cloud SDK command:

    # gcloud compute instances simulate-maintenance-event primary-instance-name
  2. Confirm that the primary node does not change:

    # crm status
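
Optionally, you can observe the cluster from either server while the live migration is in progress, and afterward list recent operations in the zone of the primary VM to confirm that the simulated maintenance event completed. This is a minimal sketch; replace primary-vm-zone with the zone of your primary VM:

    # crm_mon
    # gcloud compute operations list --zones=primary-vm-zone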

Troubleshooting

To troubleshoot problems with high-availability configurations for SAP NetWeaver on SLES, see Troubleshooting high-availability configurations for SAP.

Getting support for SAP NetWeaver on SLES

If you need help resolving a problem with high-availability clusters for SAP NetWeaver on SLES, gather the required diagnostic information and contact Cloud Customer Care. For more information, see High-availability clusters on SLES diagnostic information.

Support

For issues with Google Cloud infrastructure or services, contact Customer Care. You can find contact information on the Support Overview page in the Google Cloud Console. If Customer Care determines that a problem resides in your SAP systems, you are referred to SAP Support.

For SAP product-related issues, log your support request with SAP support. SAP evaluates the support ticket and, if it appears to be a Google Cloud infrastructure issue, transfers the ticket to the Google Cloud component BC-OP-LNX-GOOGLE or BC-OP-NT-GOOGLE.

Support requirements

Before you can receive support for SAP systems and the Google Cloud infrastructure and services that they use, you must meet the minimum support plan requirements.

For more information about the minimum support requirements for SAP on Google Cloud, see:

Performing post-deployment tasks

Before using your SAP NetWeaver system, we recommend that you back up your new SAP NetWeaver HA system.

What's next

See the following resources for more information:

  • High-availability planning guide for SAP NetWeaver on Google Cloud
  • SAP on Google Cloud: High availability white paper
  • For more information about VM administration and monitoring, see the SAP NetWeaver Operations Guide