This guide provides instructions for operating SAP HANA systems deployed on Google Cloud Platform (GCP) by following the SAP HANA on GCP deployment guide. Note that this guide is not intended to replace any of the standard SAP documentation.
Administering an SAP HANA system on GCP
This section shows how to perform administrative tasks typically required to operate an SAP HANA system, including information about starting, stopping, and cloning systems.
Starting and stopping instances
You can stop one or multiple SAP HANA hosts at any time; stopping an instance shuts down the instance. If the shutdown doesn't complete within 2 minutes, the instance is forced to halt. As a best practice, you should first stop SAP HANA running on the instance before you stop the instance.
Stopping a VM
Stopping a VM instance causes Google Compute Engine to send the ACPI power-off signal to the instance. You are not billed for the Compute Engine instance after the instance is stopped. If you have persistent disks attached to the instance, the disks are not deleted and you will be charged for them.
If the data on the persistent disk is important, you can either keep the disk or create a snapshot of the persistent disk and delete the disk to save on costs. You can create another disk from the snapshot when you need the data again.
To stop an instance:
In the Google Cloud Platform Console, navigate to the:
Select one or more instances that you want to stop.
At the top of the VM instances page, click Stop.
For more information, see Stopping an instance.
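The console steps above can also be scripted. The following is a minimal sketch, assuming the `gcloud` command-line tool is installed and authenticated on the machine where you run it. The instance name and zone are hypothetical placeholders, and the helper names are illustrative only. The sketch builds the command so that you can review it before running it, and the same helper covers starting the instances again.

```python
import subprocess


def build_instance_command(action, instances, zone):
    """Return the gcloud argument list that stops or starts the given VMs."""
    if action not in ("stop", "start"):
        raise ValueError("action must be 'stop' or 'start'")
    if not instances:
        raise ValueError("at least one instance name is required")
    return ["gcloud", "compute", "instances", action, *instances, "--zone", zone]


def run_instance_command(action, instances, zone):
    """Run the command; raises CalledProcessError if gcloud reports a failure."""
    subprocess.run(build_instance_command(action, instances, zone), check=True)


# Hypothetical instance name and zone, for illustration only.
print(" ".join(build_instance_command("stop", ["hana-node-1"], "us-central1-f")))
```

Stop SAP HANA on the instance first, as recommended earlier, and only then call `run_instance_command`.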
Restarting a VM
In the Cloud Platform Console, navigate to the:
Select the instances that you want to restart.
At the top right of the page, click the Start button to restart the instances.
For more information, see Restarting an instance.
Creating a snapshot of SAP HANA
To generate a point-in-time backup of your persistent disk, you can create a snapshot. Compute Engine redundantly stores multiple copies of each snapshot across multiple locations with automatic checksums to ensure the integrity of your data.
Snapshots are useful for the following use cases:
|Provide an easy, software-independent, and cost-effective data backup solution.||Back up your data, log, backup, and shared disks with snapshots. Schedule a daily backup of these disks for point-in-time backups of your entire dataset. After the first snapshot, only the incremental block changes are stored in subsequent snapshots, which helps save costs.|
|Migrate to a different storage type.||Persistent disks have two different storage types, standard (magnetic) and SSD, that have different cost and performance characteristics. For example, use standard for your backup volume and use SSD for your log and data volume, since they require higher performance. To migrate between storage types, use the volume snapshot, then create a new volume using the snapshot and select a different storage type.|
|Migrate SAP HANA to another region or zone.||Use snapshots to move your SAP HANA system from one zone to another zone in the same region or even to another region. Snapshots can be used globally within GCP to create disks in another zone or region. To move to another region or zone, create a snapshot of your disks including the root disk and then create the virtual machines in your desired zone/region with disks created from those snapshots.|
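The storage-type and zone migrations in the table follow the same two steps: snapshot the disk, then create a new disk from the snapshot. The following is a minimal sketch of those steps as `gcloud` commands, assuming the `gcloud` tool is available. The disk name, the zones, and the `-snap` and `-restored` suffixes are hypothetical choices made for this example.

```python
def snapshot_migration_commands(disk, source_zone, target_zone, disk_type="pd-ssd"):
    """Return the gcloud commands that snapshot a disk and recreate it
    with the requested storage type in the target zone."""
    snapshot = disk + "-snap"      # hypothetical naming convention
    new_disk = disk + "-restored"  # hypothetical naming convention
    return [
        ["gcloud", "compute", "disks", "snapshot", disk,
         "--snapshot-names", snapshot, "--zone", source_zone],
        ["gcloud", "compute", "disks", "create", new_disk,
         "--source-snapshot", snapshot, "--type", disk_type,
         "--zone", target_zone],
    ]


# Hypothetical disk and zones: move a backup disk to standard storage in another zone.
for command in snapshot_migration_commands(
        "hana-backup", "us-central1-f", "us-east1-b", disk_type="pd-standard"):
    print(" ".join(command))
```

Returning the commands instead of running them lets you review each step, or pass each list to `subprocess.run` once you are satisfied.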
Cloning your SAP HANA system
You can create snapshots of an existing SAP HANA system on GCP to create an exact clone of the system. See the following section for additional information.
To clone a single-node SAP HANA system:
Create a snapshot of your data and backup disks.
Create new disks using the snapshots.
In the Cloud Platform Console, navigate to the:
Click on the instance to clone to open the instance detail page, then click Clone.
Attach the disks that were created from the snapshots.
To clone a multi-node SAP HANA system:
Provision a new SAP HANA system with the same configuration as the SAP HANA system you want to clone.
Perform a data backup of the original system.
Restore the backup of the original system into the new system.
Backup and recovery
Backups are vital for protecting your System of Record (your database). Because SAP HANA is an in-memory database, you should create regular backups so you can recover from data corruption. The SAP HANA system provides native backup and recovery features to help you do this. You can use GCP services such as Google Cloud Storage as the backup destination for SAP HANA backups.
This document assumes you are familiar with SAP HANA backup and recovery, along with the following SAP service Notes:
- 1642148: FAQ: SAP HANA Database Backup & Recovery
- 1821207: Determining required recovery files
- 1869119: Checking backups using
- 1873247: Checking recoverability with
- 1651055: Scheduling SAP HANA Database Backups in Linux
Using Compute Engine and Cloud Storage for backups
If you followed the
you have an SAP HANA single node installation with a
This is backed by using a standard persistent disk. You use the standard SAP
tools to create your online database backups to
Finally, you save the completed backup by uploading it to a Cloud Storage
bucket, from which you can download the backup, when you need to recover.
Using Compute Engine to create backups and disk snapshots
You can use Compute Engine for SAP HANA backups, and you also have the option of backing up the entire disk hosting your data and log using persistent-disk snapshots.
If you followed the instructions in the deployment guide, you have an SAP HANA single node installation with a /hanabackup directory for your online database backups. You can use that same directory to store snapshots of the backup volume and maintain a point-in-time backup of your data and log.
An advantage of snapshots is that they are incremental, where each subsequent backup only stores incremental block changes instead of creating an entirely new backup. Compute Engine redundantly stores multiple copies of each snapshot across multiple locations with automatic checksums to ensure the integrity of your data.
Here is an illustration of the incremental backups:
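As a rough model of the incremental behavior, the following toy sketch tracks which blocks each successive snapshot would need to store. It is a simplification written for this guide, not a description of how Compute Engine implements snapshots. The block contents are made-up values, and deleted blocks are ignored.

```python
def incremental_snapshots(disk_states):
    """For each disk state (a dict of block number -> content), return the
    blocks a snapshot would store: only those that are new or changed
    since the previous snapshot."""
    stored = []
    previous = {}
    for state in disk_states:
        changed = {block: data for block, data in state.items()
                   if previous.get(block) != data}
        stored.append(changed)
        previous = dict(state)
    return stored


# Three points in time for a tiny, hypothetical four-block disk.
states = [
    {0: "a", 1: "b", 2: "c"},
    {0: "a", 1: "B", 2: "c"},           # block 1 modified
    {0: "a", 1: "B", 2: "c", 3: "d"},   # block 3 added
]
for number, blocks in enumerate(incremental_snapshots(states), start=1):
    print("snapshot", number, "stores blocks", sorted(blocks))
```

The first snapshot stores every block, while each later snapshot stores only what changed, which is why subsequent snapshots are smaller.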
Cloud Storage as your backup destination
Cloud Storage is a good choice to use as your backup destination for SAP HANA because it provides high durability and availability of data.
Cloud Storage is an object store for files of any type or format. It has virtually unlimited storage and you do not have to worry about provisioning it or adding more capacity to it. An object in Cloud Storage consists of file data and its associated metadata, and can be up to 5 TB in size. A Cloud Storage bucket can store any number of objects.
With Cloud Storage, your data is stored in multiple locations, which provides high durability and high availability. When you upload your data to Cloud Storage or copy your data within it, Cloud Storage reports the action as successful only if object redundancy is achieved.
The following table shows the different storage options you have if you use Cloud Storage:
|Data access needed||Cloud Storage options recommended|
|Frequent access||Choose Multi-Regional or Regional storage class for backups accessed multiple times in a month. Multi-Regional is useful for disaster recovery because its data is stored redundantly in at least two regions separated by at least 100 miles within the multi-regional location of the bucket. Note that data stored as Multi-Regional Storage can be placed only in multi-regional locations, such as the United States, the European Union, or Asia, not in specific regional locations.|
|Infrequent access||Choose Nearline or Coldline storage for infrequently accessed data. Nearline is a good choice for backed-up data you plan to access at most once a month, while Coldline is better for data that has very low probability of access, perhaps once a year at most. Consider replacing your tape-based backup solution with Nearline or Coldline.|
When you plan your usage of these storage options, start with the frequently accessed tier and age your backup data through to the infrequent access tiers. Backups generally become rarely used as they become older. The probability of needing a backup that is 3 years old is extremely low and you can age this backup into the Coldline tier to save on costs, which are currently 7/10th of a cent per GB/month.
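The aging approach can be expressed as a small rule that maps the age of a backup to a storage class. The following is a minimal sketch. The 30-day and 365-day thresholds are assumptions chosen for this example to mirror the once-a-month and once-a-year guidance in the table, and you should adjust them to your own retention policy.

```python
def storage_class_for_age(age_days, nearline_after=30, coldline_after=365,
                          frequent_class="REGIONAL"):
    """Pick a Cloud Storage class for a backup of the given age in days."""
    if age_days < 0:
        raise ValueError("age_days must not be negative")
    if age_days >= coldline_after:
        return "COLDLINE"
    if age_days >= nearline_after:
        return "NEARLINE"
    return frequent_class


# A one-week-old, a three-month-old, and a three-year-old backup.
for age in (7, 90, 3 * 365):
    print(age, "days ->", storage_class_for_age(age))
```

You could apply the result by rewriting objects to the chosen class or, alternatively, by configuring lifecycle rules on the bucket.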
Cloud Storage compared to tape backup
The traditional, on-premises backup destination is tape. Cloud Storage has many benefits over tape, including the ability to automatically store backups “offsite” from the source system, since data in Cloud Storage is replicated across multiple facilities. This also means that the backups stored in Cloud Storage are highly available.
Another key difference is the speed of restoring backups when you need to use them. If you need to create a new SAP HANA system from backup or restore an existing system from backups, Cloud Storage provides faster access to your data and helps you build the system faster.
Managing identity and access to backups
When you use Cloud Storage or Compute Engine to back up your SAP HANA data, access to those backups is controlled by Google Cloud Identity and Access Management (IAM). This feature gives administrators the ability to authorize who can take action on specific resources. IAM provides you with centralized, full control and visibility for managing all of your GCP resources, including your backups.
IAM also provides a full audit trail: the history of permissions authorization, removal, and delegation is surfaced automatically for your admins. This allows you to configure policies that monitor access to the data in your backups and complete the full access-control cycle. Cloud IAM provides a unified view into security policy across your entire organization, with built-in auditing to ease compliance processes.
To grant access to backups in Cloud Storage:
Specify the user you are granting access to, and assign the role Storage > Storage Object Creator:
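In policy terms, the grant above adds a member to a role binding. The following is a minimal sketch that works on a plain dictionary in the shape of an IAM policy document. It assumes that `roles/storage.objectCreator` is the identifier behind the Storage Object Creator role, and the email address is a hypothetical placeholder. The sketch only edits the dictionary and does not call any API.

```python
import copy


def grant_role(policy, role, member):
    """Return a copy of the policy with the member added to the role binding."""
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding["role"] == role:
            if member not in binding["members"]:
                binding["members"].append(member)
            return updated
    # No binding exists for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


# Hypothetical user who is allowed to upload backups.
policy = grant_role({"bindings": []},
                    "roles/storage.objectCreator",
                    "user:backup-operator@example.com")
print(policy)
```

Working on a copy keeps the original policy unchanged, which makes it easier to compare the two before you apply the update.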
How to make backups
SAP HANA systems provisioned using the deployment guide are configured with a set of persistent disk volumes to be used as an NFS-mounted backup destination. SAP HANA backups are first stored on these local persistent disk volumes, and should then be copied to Cloud Storage for long-term storage. You can either manually copy the backups over to Cloud Storage or schedule the copy to Cloud Storage in a crontab.
You can use SAP HANA Studio, SQL commands, or the DBA Cockpit to start or schedule SAP HANA data backups. Log backups are written automatically unless disabled. The following screenshot shows an example:
Configuring SAP HANA
If you followed the deployment guide instructions, the SAP HANA global.ini configuration file is customized so that database backups are stored in /hanabackup/data/ and automatic log archival files are stored in /hanabackup/log/, as follows:
[passport]
enable = false

[persistence]
continuous_flush_interval_s = 0
tablepreload_write_interval = 0
savepoint_interval_s = 0
enable_auto_log_backup = no
log_preformat_segment_count = 10
log_segment_size_mb = 4096
log_mode = overwrite
basepath_datavolumes = /hana/data
basepath_logvolumes = /hana/log
basepath_databackup = /hanabackup/data
basepath_logbackup = /hanabackup/log

[resource_tracking]
service_thread_sampling_monitor_enabled = false

[system_information]
usage = production

[trace]
compressioninterval = 0
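If you want to confirm the backup destinations from a script, for example before an upload job runs, you can read them from the configuration text. The following is a minimal sketch that uses the Python standard library. The sample text repeats only the relevant part of the `[persistence]` section, and the function name is illustrative.

```python
import configparser

SAMPLE_GLOBAL_INI = """
[persistence]
basepath_datavolumes = /hana/data
basepath_logvolumes = /hana/log
basepath_databackup = /hanabackup/data
basepath_logbackup = /hanabackup/log
"""


def backup_paths(ini_text):
    """Return the (data backup, log backup) base paths from global.ini text."""
    # Interpolation is disabled so that '%' characters in values are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["persistence"]
    return section["basepath_databackup"], section["basepath_logbackup"]


print(backup_paths(SAMPLE_GLOBAL_INI))
```

To check a live system, pass the contents of your own global.ini file to `backup_paths` instead of the sample text.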
Gcloud Python is an idiomatic Python client that you can use to access GCP services. This guide uses Gcloud Python to perform backup and restore operations to and from Cloud Storage for your SAP HANA database backups.
If you followed the deployment guide instructions, the Gcloud Python libraries are already available in the Compute Engine instances.
The libraries are open source and allow you to operate on your Cloud Storage bucket to store and retrieve backup data.
You can run the following command to list objects in your Cloud Storage bucket. You can use it to list the backup objects available:
python 2>/dev/null - <<EOF
from google.cloud import storage
storage_client = storage.Client()
bucket = storage_client.get_bucket("<bucket_name>")
blobs = bucket.list_blobs()
for fileblob in blobs:
    print(fileblob.name)
EOF
For complete details about Gcloud Python, see the storage client library reference documentation.
Setting up your SAP support channel with SAProuter
If you need to allow an SAP support engineer to access your SAP HANA systems on GCP, you can do so using SAProuter. Follow these steps:
Launch the Compute Engine VM instance that the SAProuter software will be installed on, and assign an external IP so the instance has Internet access.
Create and configure a specific SAProuter firewall rule in your network. In this rule, allow only the required inbound and outbound access to the SAP support network, for the SAProuter instance.
This inbound and outbound access should be limited to a specific IP address that SAP provides for you to connect to, along with TCP port 3299. Add a target tag to your firewall rule and enter your instance name. This ensures that the firewall rule applies only to the new instance. See the firewall rules documentation for additional details about creating and configuring firewall rules.
Install the SAProuter software, following SAP Note 1628296, and create a saprouttab file that allows access from SAP to your SAP HANA systems on GCP.
Set up the connection with SAP. For your Internet connection, use Secure Network Communication. For more information, see SAP Remote Support – Help.
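The firewall rule described in the steps above might be sketched as follows. This is a minimal sketch under stated assumptions: the rule names, network name, and target tag are hypothetical, and the IP address is a documentation placeholder that stands in for the address that SAP provides. The sketch only builds `gcloud` commands for you to review.

```python
import ipaddress


def saprouter_firewall_commands(network, sap_ip, target_tag, port=3299):
    """Return gcloud commands that allow SAProuter traffic to and from one address."""
    # A single address is normalized to a /32 range; invalid input raises ValueError.
    cidr = str(ipaddress.ip_network(sap_ip))
    allow = "tcp:{}".format(port)
    base = ["gcloud", "compute", "firewall-rules", "create"]
    ingress = base + ["saprouter-in", "--network", network,
                      "--direction", "INGRESS", "--allow", allow,
                      "--source-ranges", cidr, "--target-tags", target_tag]
    egress = base + ["saprouter-out", "--network", network,
                     "--direction", "EGRESS", "--allow", allow,
                     "--destination-ranges", cidr, "--target-tags", target_tag]
    return [ingress, egress]


# 203.0.113.10 is a placeholder, not a real SAP support address.
for command in saprouter_firewall_commands("hana-network", "203.0.113.10", "saprouter"):
    print(" ".join(command))
```

The target tag limits both rules to the SAProuter instance, which matches the minimum-access intent of the steps.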
Configuring your network
You are provisioning your SAP HANA system using virtual machines with the GCP virtual network. GCP uses state-of-the-art, software-defined networking and distributed-systems technologies to host and deliver your services around the world.
For SAP HANA, create a non-default subnet network with non-overlapping CIDR IP ranges for each subnetwork in the network. Note that each subnetwork and its internal IP ranges are mapped to a single region.
A subnetwork spans all of the zones within the region where it is created. However, when you create a VM instance, you specify a zone and a subnetwork for the VM. For example, you can create one set of instances in subnetwork1 and in region1 and another set of instances in subnetwork2 and in region1, depending on your needs.
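Before you create the subnetworks, you can check that the planned CIDR ranges do not overlap. The following is a minimal sketch that uses the Python standard library. The subnetwork names and ranges are hypothetical examples.

```python
import ipaddress
from itertools import combinations


def overlapping_subnets(cidrs):
    """Return the pairs of subnetwork names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [(first, second)
            for (first, net_a), (second, net_b) in combinations(networks.items(), 2)
            if net_a.overlaps(net_b)]


# Hypothetical plan: the third range overlaps the first one.
plan = {
    "subnetwork1": "10.0.0.0/24",
    "subnetwork2": "10.0.1.0/24",
    "subnetwork3": "10.0.0.128/25",
}
print(overlapping_subnets(plan))
```

An empty result means that every pair of ranges is disjoint and the plan satisfies the non-overlapping requirement.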
A new network has no firewall rules and hence no network access. You should create firewall rules that open access to your SAP HANA instances based on a minimum privilege model. The firewall rules apply to the entire network and can also be configured to apply to specific target instances by using the tagging mechanism.
Routes are global, not regional, resources that are attached to a single network. User-created routes apply to all instances in a network. This means you can add a route that forwards traffic from instance to instance within the same network, even across subnetworks, without requiring external IP addresses.
For your SAP HANA instance, launch the instance with no external IP address and configure another virtual machine as a NAT gateway for external access. This configuration requires you to add your NAT gateway as a route for your SAP HANA instance. This procedure is described in the deployment guide.
The following sections discuss security operations.
Minimum privilege model
Your first line of defense is to restrict who can reach the instance by using firewalls. By creating firewall rules, you can restrict all traffic to a network or target machines on a given set of ports to specific source IP addresses. You should follow the minimum-privilege model to restrict access to the specific IP addresses, protocols, and ports that need access. For example, you should always set up a bastion host, and allow SSH into your SAP HANA system only from that host.
You should configure your SAP HANA system and the operating system with recommended security settings. For example, make sure that only relevant network ports are whitelisted for access, harden the operating system on which you are running SAP HANA, and so on.
Refer to the following SAP notes:
- 1944799: Guidelines for SLES SAP HANA installation
- 1730999: Recommended configuration changes
- 1731000: Unrecommended configuration changes
Disabling unneeded SAP HANA Services
If you do not require SAP HANA Extended Application Services (SAP HANA XS), disable the service. Refer to SAP note 1697613: Remove XS Engine out of SAP HANA database.
After the service has been disabled, remove all the TCP ports that were opened for the service. In GCP, this means editing your firewall rules for your network to remove these ports from the whitelist.
Google Cloud Audit Logging consists of two log streams, Admin Activity and Data Access, both of which are automatically generated by GCP. These can help you answer the questions, "Who did what, where, and when?" in your GCP projects.
Admin Activity logs contain log entries for API calls or administrative actions that modify the configuration or metadata of a service or project. This log is always enabled and is visible by all project members.
Data Access logs contain log entries for API calls that create, modify, or read user-provided data managed by a service, such as data stored in a database service. This type of logging is enabled by default in your project and is accessible to you through Stackdriver Logging, or through your activity feed.
Securing a Cloud Storage bucket
If you use Cloud Storage to host your backups for your data and log, make sure you use TLS (HTTPS) while sending data to Cloud Storage from your instances to protect data in transit. Cloud Storage automatically encrypts data at rest. You can specify your own encryption keys if you have your own key-management system.
Refer to the Cloud Storage security documentation for best practices for Cloud Storage.
Related security documents
Refer to the following additional security resources for your SAP HANA environment on GCP:
- Security Center
- Compliance in the Google Cloud
- Google Cloud security whitepaper
- Google Infrastructure security design
High availability for SAP HANA on GCP
Instead of having to purchase and maintain a replica in the Cloud, you can obtain high availability for SAP HANA on GCP by ensuring your single node SAP HANA instance is protected by using the GCP automatic-recovery features. GCP and SAP provide several features that handle failures at the infrastructure level to help improve availability, as described in the following table.
|Live Migration||Compute Engine monitors the state of the underlying infrastructure and automatically migrates your instance away from an infrastructure maintenance event, keeping your instance running during the migration. No user intervention is required. A recovered instance is identical to the original instance, including the instance ID, private IP address and all instance metadata and storage. By default, standard instances are set to live migrate.|
|Automatic Instance Restart||Compute Engine signals your instance to shut down, waits for a short period of time for your instance to shut down cleanly, terminates the instance, and restarts it away from the maintenance event. This option is ideal for instances that demand constant, maximum performance, and your overall application is built to handle instance failures or reboots.|
|SAP HANA Automatic Service Restart||SAP HANA Service Auto-Restart is a fault recovery solution provided out of the box by SAP. SAP HANA has many configured services running all the time for various activities. When any of these services is disabled due to a software failure or human error, the service is automatically restarted with the SAP HANA Service AutoRestart watchdog function. When the service is restarted, it loads all the necessary data back into memory and resumes its operation.|
The SAP HANA system provides several high availability features to make sure that your SAP HANA database can withstand failures at the software or infrastructure level. Among these features is SAP HANA System replication, which GCP supports. Note that any restrictions defined by SAP, including distance limitation for synchronous replication, are also in effect on GCP.
The following sections show some of the supported recovery scenarios.
SAP HANA system replication with preload
In this scenario, your SAP HANA database is replicated to a dedicated standby instance that has a unique hostname and its own persistent disks attached:
We recommend that you use synchronous replication, where the SQL transactions are not committed on the primary node until they are committed on the secondary node. This keeps the disaster recovery instance 100% in sync and ensures a zero recovery point objective.
Synchronous replication should not be used to replicate between different regions or zones. If your target disaster-recovery instance is in a different zone or region, you must use asynchronous replication, where there is no wait for the secondary instance to acknowledge the data before the commit on the primary. In this case, you could lose small amounts of data if a disaster happens. This level of loss is usually acceptable in a disaster scenario.
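The choice between the two modes can be reduced to a placement check. The following is a minimal sketch that picks the mode from the zones of the two instances and builds a registration command for the secondary. It assumes the standard `hdbnsutil -sr_register` options as documented by SAP, so verify them against the SAP HANA documentation for your revision. The host name, instance number, site name, and zones are hypothetical.

```python
def replication_mode(primary_zone, secondary_zone):
    """Synchronous replication only when both instances are in the same zone."""
    return "sync" if primary_zone == secondary_zone else "async"


def register_secondary_command(primary_host, instance_number, site_name,
                               primary_zone, secondary_zone):
    """Return the command to run on the secondary as the <sid>adm user."""
    mode = replication_mode(primary_zone, secondary_zone)
    return ["hdbnsutil", "-sr_register",
            "--remoteHost=" + primary_host,
            "--remoteInstance=" + instance_number,
            "--replicationMode=" + mode,
            "--name=" + site_name]


# Hypothetical setup with the secondary in a different zone.
print(" ".join(register_secondary_command(
    "hana-primary", "00", "SITEB", "us-central1-f", "us-central1-b")))
```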
If you need to fail over to the disaster-recovery instance, you must execute a manual command on the disaster-recovery instance. Because all the data is preloaded, failover takes only around 90 seconds. Because the disaster-recovery instance has its own hostname and IP address, you might need to manually update any applications that use the SAP HANA database with the new hostname details.
SAP HANA system replication without preload
In this scenario, the memory requirements are much smaller because the data is not preloaded into memory. You need only 64 GB of memory or the amount of memory which is consumed by the rowstore on the target host, whichever is larger. This gives you GCP instance options that can result in cost savings:
- The target instance can be located on the same GCP instance as the development or test systems. When the disaster-recovery system needs to be invoked, you shut down dev and test, perform the manual failover command, and then remove the global allocation limit. This is the setting that restricts the disaster-recovery instance to around 64 GB of memory.
- The dedicated target instance could be on a GCP instance that isn’t supported for SAP HANA, and thus has a much lower running cost. For example, you can use an n1-highmem-16 instance as the target. In a disaster-recovery scenario, you perform the normal takeover and then change the GCP instance type to a supported instance size.
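The memory rule for a target without preload is a simple maximum. The following is a minimal sketch of the arithmetic. It assumes that the limit is applied through the `global_allocation_limit` parameter, which takes a value in MB, and the row store sizes are made-up examples.

```python
def standby_allocation_limit_mb(rowstore_gb, floor_gb=64):
    """Return the memory limit in MB for a replication target without preload:
    64 GB or the row store size, whichever is larger."""
    required_gb = max(floor_gb, rowstore_gb)
    return required_gb * 1024


# A 20 GB row store stays at the 64 GB floor; a 100 GB row store raises the limit.
for rowstore in (20, 100):
    print(rowstore, "GB row store ->", standby_allocation_limit_mb(rowstore), "MB")
```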
Both synchronous and asynchronous replication are supported. However, if you plan on using synchronous replication, the source and target instances must reside in the same GCP zone.
Starting with SAP HANA SPS09, you can also use the Python-based API included with SAP HANA to create your own high-availability/disaster-recovery (HA/DR) provider and integrate it with the SAP HANA System Replication takeover process to automate tasks such as shutting down the dev and/or test systems in the secondary node. To learn more about the HA/DR provider implementation, see Implementing a HA/DR Provider on the SAP website.
Restoring from backup
If a longer recovery time objective is acceptable, you can achieve disaster recovery by restoring from backup. There are two caveats:
Your backups must be offloaded to Cloud Storage. To protect against the failure of an entire region, create a multi-region bucket.
Log backups must be continuously offloaded to Cloud Storage, and your recovery point objective cannot be less than 15 minutes.
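One way to monitor the second caveat is to look at the gaps between log backup uploads. The following is a minimal sketch, assuming that you can obtain the upload timestamps, for example from the creation times of the objects in your bucket. The timestamps shown are made up, and the function names are illustrative.

```python
from datetime import datetime, timedelta


def largest_upload_gap(upload_times, now):
    """Return the longest interval between consecutive log backup uploads,
    including the time elapsed since the most recent upload."""
    points = sorted(upload_times) + [now]
    return max(later - earlier for earlier, later in zip(points, points[1:]))


def meets_rpo(upload_times, now, rpo=timedelta(minutes=15)):
    """True if no gap between uploads exceeds the recovery point objective."""
    if not upload_times:
        return False
    return largest_upload_gap(upload_times, now) <= rpo


# Made-up timestamps: uploads every 10 minutes, checked 5 minutes after the last one.
start = datetime(2017, 1, 1, 12, 0)
uploads = [start + timedelta(minutes=10 * i) for i in range(3)]
print(meets_rpo(uploads, now=start + timedelta(minutes=25)))
```

A gap longer than the objective means that a failure during that window would have lost more data than planned.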
Triggering a failover
To invoke disaster recovery, you must trigger the SAP HANA System Replication Takeover procedure in your secondary/disaster-recovery system. To trigger the takeover, follow the standard SAP HANA takeover process.
SAP OSS Note 2063657 provides guidelines to help you decide whether takeover is the best choice.
Here are the steps you might take for a typical backup task, using SAP HANA Studio as an example:
In the SAP HANA Backup Editor, select Open Backup Wizard.
- Select File as the destination type. This backs up the database to files in the specified file system.
- Specify the backup destination, /hanabackup/data/[SID], and the backup prefix. Be sure to replace [SID] with the proper SAP SID.
- Click Next.
Click Finish in the confirmation form to start the backup.
When the backup starts, a status window displays the progress of your backup. Wait for the backup to complete.
When the backup is complete, the backup summary displays a "Finished" message.
Sign in to your SAP HANA system and verify that the backups are available at the expected locations in the file system. For example:
Push or synchronize the backup files from the /hanabackup file system to Cloud Storage. The following sample Python script pushes the backups from /hanabackup/data and /hanabackup/log to the bucket used for backups, in the form [HOSTNAME]/[data|log]/YYYY/MM/DD/HH/[BACKUP_FILE_NAME]. This allows you to identify backup files based on the time at which the backup was copied. Run this Gcloud Python script from your operating system bash prompt:
python 2>/dev/null - <<EOF
import os
import socket
from datetime import datetime
from google.cloud import storage

storage_client = storage.Client()
# Timestamp prefix (YYYY/MM/DD/HH) that records when the backup was copied.
current_hour = datetime.today().strftime('%Y/%m/%d/%H')
hostname = socket.gethostname()
bucket = storage_client.get_bucket("hanabackup")

# Upload the data backups that use the COMPLETE_DATA_BACKUP prefix.
for subdir, dirs, files in os.walk('/hanabackup/data/H2D/'):
    for name in files:
        backupfilename = os.path.join(subdir, name)
        if 'COMPLETE_DATA_BACKUP' in backupfilename:
            backup_file = hostname + '/data/' + current_hour + '/' + name
            blob = bucket.blob(backup_file)
            blob.upload_from_filename(filename=backupfilename)

# Upload the log backups. Log backup file names do not carry the data
# backup prefix, so every file in the log backup directory is uploaded.
for subdir, dirs, files in os.walk('/hanabackup/log/H2D/'):
    for name in files:
        backupfilename = os.path.join(subdir, name)
        backup_file = hostname + '/log/' + current_hour + '/' + name
        blob = bucket.blob(backup_file)
        blob.upload_from_filename(filename=backupfilename)
EOF
Use either the Gcloud Python libraries or Cloud Platform Console to list the backup data.
To restore your SAP HANA database from a backup:
If the backup files are not available already in the /hanabackup file system but are in Cloud Storage, download the files from Cloud Storage by running the following script from your operating system bash prompt:
python - <<EOF
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket("hanabackup")
blobs = bucket.list_blobs()
for fileblob in blobs:
    blob = bucket.blob(fileblob.name)
    fname = str(fileblob.name).split('/')[-1]
    # Download in 1 GiB chunks.
    blob.chunk_size = 1 << 30
    # Log backups go to the log directory; everything else is a data backup.
    if 'log' in fname:
        blob.download_to_filename('/hanabackup/log/H2D/' + fname)
    else:
        blob.download_to_filename('/hanabackup/data/H2D/' + fname)
EOF
To recover the SAP HANA database, click Backup and Recovery > Recover System:
Specify the location of your backups in your local filesystem and click Add .
Select Recover without the backup catalog:
Select File as the destination type, then specify the location of the backup files and the correct prefix for your backup. (In the backup example, remember that you used COMPLETE_DATA_BACKUP as the prefix.)
Click Next twice.
Click Finish to start the recovery.
When recovery completes, resume normal operations and remove backup files from the
Notes for scale-out deployments
In the scale-out implementation, the previously described high-availability solution using live migration and automatic restart works in the same way as in a single node setup. The main difference is that the /hana/shared volume is NFS-mounted to all the worker nodes and mastered in the HANA master. There is a very brief period of inaccessibility on the NFS volume in the event of the master node's live migration or automatic restart. When the master node has restarted, the NFS volume soon functions again on all nodes, and normal operation resumes.
The backup volume, /hanabackup, must be available on all nodes during backup and recovery operations. In the event of failure, verify that /hanabackup is mounted on all nodes and remount it on any node where it is not. When you copy the backup set to another volume or to Cloud Storage, run the copy on the master node to achieve better I/O performance and reduce network usage. To simplify the backup and recovery process, you can use Cloud Storage FUSE to mount the Cloud Storage bucket on each node.
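The mount verification can be scripted on each node. The following is a minimal sketch that checks a mount table in the format of `/proc/mounts` on Linux. The sample table, including the NFS server name, is made up for illustration, and the function name is not part of any SAP or GCP tool.

```python
def is_mounted(mount_point, mounts_text):
    """Return True if mount_point appears as a mount target in the given
    /proc/mounts-style text (device, mount point, type, options, ...)."""
    wanted = mount_point.rstrip("/") or "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == wanted:
            return True
    return False


# Made-up mount table for a worker node.
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw,relatime 0 0
hana-master:/hana/shared /hana/shared nfs rw,relatime 0 0
hana-master:/hanabackup /hanabackup nfs rw,relatime 0 0
"""
print(is_mounted("/hanabackup", SAMPLE_MOUNTS))
```

On a live node, pass the contents of `/proc/mounts` to the function, for example with `open("/proc/mounts").read()`, and remount the volume if the check fails.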
The scale-out performance is only as good as your data distribution. The better the data is distributed, the better your query performance will be. This requires that you know your data well, understand how the data is being consumed, and design table distribution and partitioning accordingly. Please refer to SAP Note 2081591.
You might find the following standard SAP documents helpful:
You might also find the following GCP documents useful: