You can send SAP HANA backups directly to Cloud Storage from SAP HANA instances that are running on Google Cloud, on Bare Metal Solution, on premises, or on other cloud platforms by using the SAP-certified Cloud Storage Backint agent for SAP HANA (Backint agent).
The Backint agent is integrated with SAP HANA so that you can store and retrieve backups directly from Cloud Storage by using the native SAP backup and recovery functions.
When you use the Backint agent, you don't need to use persistent disk storage for backups.
For installation instructions for the Backint agent, see the Cloud Storage Backint agent for SAP HANA installation guide.
For more information about the SAP certification of the Backint agent, see:
The Backint agent configuration file
You configure the Backint agent by specifying parameters in a plain text file. The default configuration file is called parameters.txt and its default location is /usr/sap/SID/SYS/global/hdb/opt/backint/backint-gcs/parameters.txt.
You can specify multiple configuration files by giving each file a different name. For example, you might specify a configuration for log backups in a file called backint-log-backups.txt and a configuration for data backups in a file called backint-data-backups.txt.
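For example, a minimal parameters.txt might contain just the two parameters that ship in the default file (the bucket name here is a placeholder):

```
#BUCKET example-hana-backups
#DISABLE_COMPRESSION
```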
Storing backups in Cloud Storage buckets
The Backint agent stores your SAP HANA backups in a Cloud Storage bucket.
When you create a bucket, you can choose the bucket location and the bucket storage class.
A bucket location can be regional, dual-regional, or multi-regional. Which you choose depends on your need to restrict the location of your data, your latency requirements for backups and restores, and your need for protection against regional outages. For more information, see Bucket locations.
Select dual- or multi-regional buckets in regions that are the same as or close to the regions in which your SAP HANA instances are running.
Choose a storage class based on how long you need to keep your backups, how frequently you expect to access them, and the cost. For more information, see Storage classes.
Multistreaming data backups with the Backint agent
For versions earlier than SAP HANA 2.0 SP05, SAP HANA supports multistreaming for databases larger than 128 GB. As of SAP HANA 2.0 SP05, this threshold is configurable by using the SAP HANA parameter parallel_data_backup_backint_size_threshold, which specifies the minimum database backup size, in GB, for multistreaming to be enabled.
Multistreaming is useful for increasing throughput and for backing up databases that are larger than 5 TB, which is the maximum size for a single object in Cloud Storage.
The optimum number of channels that you use for multistreaming depends on the Cloud Storage bucket type you are using and the environment in which SAP HANA is running. Also consider the throughput capability of the data disk attached to your HANA instance, as well as the bandwidth your administrator allocates for backup activities.
You can adjust the throughput by changing the number of streams, or limit throughput by using the #RATE_LIMIT_MB parameter in parameters.txt, the Backint agent configuration file.
For a multi-regional bucket, start with 8 channels by setting the parallel_data_backup_backint_channels parameter to 8 in the SAP HANA global.ini configuration file. For a regional bucket, start with 12 channels by setting parallel_data_backup_backint_channels in the global.ini file to 12.
Adjust the number of channels as necessary to meet your backup performance objectives.
As stated in the SAP HANA documentation, each additional channel requires an I/O buffer of 512 MB. Specify the size of the I/O buffer by using the data_backup_buffer_size parameter in the backup section of the global.ini file. For more information about the effect of the I/O buffer size on backup times, see SAP Note 2657261. As of SAP HANA 2.0 SP05, SAP specifies a maximum value of 4 GB for this parameter. Testing on Google Cloud has not shown a benefit to increasing the buffer size significantly beyond the default, but this might vary for your workload.
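As an illustrative sketch for a multi-regional bucket, the backup section of global.ini might look like the following. The buffer value is an assumption chosen to match 8 channels at 512 MB each (4096 MB, which is also the SAP-specified maximum):

```
[backup]
parallel_data_backup_backint_channels = 8
data_backup_buffer_size = 4096
```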
For more information about multistreaming, in the SAP HANA Administration Guide that is specific to your SAP HANA version, see Multistreaming Data Backups with Third-Party Backup Tools.
Parallel uploads
You can improve the upload performance of log backup files by enabling the parallel upload feature of the Backint agent. This is especially useful for log backup files because they cannot be multistreamed from SAP HANA.
For data backups, you can tune the number of SAP HANA backup channels by using only the SAP HANA parameter parallel_data_backup_backint_channels.
When parallel upload is enabled, the Backint agent splits each individual backup file that is received from SAP HANA into multiple parts that are then uploaded in parallel, which improves upload performance.
As the parts are received by Cloud Storage, they are reassembled and stored as the original single file that was received by Backint agent from SAP HANA. The single file is subject to the 5 TB size limit for objects in Cloud Storage.
Configuring parallel upload
You enable the parallel upload feature in the parameters.txt configuration file by specifying the maximum number of parallel upload threads on the #PARALLEL_FACTOR parameter.
The parameters #PARALLEL_PART_SIZE_MB, which sets the size of each part, and #THREADS, which determines the number of worker threads, are for advanced tuning only. Don't change these settings unless you are instructed to do so by Cloud Customer Care. The default values rarely need to be changed.
For more information about the parallel upload parameters, see Configuration options for the Backint agent.
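For example, a parameters.txt that enables parallel upload might look like the following sketch. The bucket name is a placeholder, and #DISABLE_COMPRESSION is included because compression is incompatible with parallel upload:

```
#BUCKET example-hana-backups
#DISABLE_COMPRESSION
#PARALLEL_FACTOR 8
```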
Parallel upload restrictions
The following restrictions apply to the parallel upload feature:
- If you enable encryption with either the #ENCRYPTION_KEY or #KMS_KEY_NAME configuration parameter, then you cannot use parallel upload. Encryption is incompatible with parallel upload. If you specify the #PARALLEL_FACTOR parameter with either of these encryption parameters, then the Backint agent exits with a status of 1.
- If you enable compression, then you cannot use parallel upload. Compression is incompatible with parallel upload. From version 1.0.22, if you specify the #PARALLEL_FACTOR parameter and omit the #DISABLE_COMPRESSION parameter in your configuration, then the Backint agent exits with a status of 1.
- If your Cloud Storage bucket implements a retention policy, then the bucket does not support parallel uploads. A retention policy prevents the reassembly of the parts into a single file, which causes the upload to fail.
Tuning parallel upload
For log backups, parallel uploads can significantly improve the backup throughput because SAP HANA does not multistream log backups. In most cases, specifying a #PARALLEL_FACTOR of 16 or less is sufficient. For very large log volumes, you can maximize the throughput by using a high #PARALLEL_FACTOR value, such as 16, and increasing the values for the SAP HANA parameters log_segment_size_mb and max_log_backup_size.
In some cases, using a high #PARALLEL_FACTOR value can decrease the overall throughput, such as when you are also using a high number of parallel backup channels.
To limit the network bandwidth that your backups use, use the #RATE_LIMIT_MB parameter to set the maximum amount of bandwidth that parallel uploads can use.
To find a good setting for your specific environment, workload, and backup type, perform tests with different settings and measure the backup throughput.
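As a starting point for such tests, a log-backup configuration that uses a high parallel factor while capping bandwidth might look like the following sketch. The bucket name and the rate limit value are placeholders to tune, not recommendations:

```
#BUCKET example-hana-log-backups
#DISABLE_COMPRESSION
#PARALLEL_FACTOR 16
#RATE_LIMIT_MB 512
```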
Authentication and access control for the Backint agent
Google Cloud uses service accounts to identify programs like the Backint agent and to control which Google Cloud resources the programs can access.
Required Cloud Storage permissions
A service account for the Backint agent must be granted permissions to the Google Cloud resources that the Backint agent accesses. The Storage Object Admin role provides list, get, create, and delete permissions for objects in Cloud Storage buckets.
You can set the permissions for the service account at the project level or the bucket level. If you set it at the project level, you give the Backint agent access to all of the buckets in your project. If you set it at the bucket level, you give the Backint agent access to only a single bucket. For more information about Cloud Storage bucket permissions, see:
Service account options for the Backint agent
If SAP HANA is running on a Compute Engine VM, by default, the Backint agent uses the service account of the VM.
If you use the VM service account, the Backint agent has the same project-level permissions as all of the other programs and processes that use the VM service account.
For the strictest access control, create a separate service account for the Backint agent and grant the service account access to the bucket at the bucket level.
If SAP HANA is not running on a Compute Engine VM, you must create a service account for the Backint agent. Create the service account in the Google Cloud project that contains the Cloud Storage bucket that the Backint agent will use.
When you create a service account for the Backint agent, you also need to create a service account key. You store the key on the SAP HANA host and specify the path to the key in the parameters.txt file. When SAP HANA is running on a Compute Engine VM, specifying the path to a key directs the Backint agent to use the service account that is associated with the key instead of the VM service account.
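The following gcloud commands sketch this setup with a dedicated service account and bucket-level access. The project ID, service account name, bucket name, and key path are placeholders:

```
# Create a dedicated service account for the Backint agent.
gcloud iam service-accounts create backint-sa \
    --project=example-project

# Grant the service account object admin access on the backup
# bucket only (bucket-level, not project-level).
gcloud storage buckets add-iam-policy-binding gs://example-hana-backups \
    --member="serviceAccount:backint-sa@example-project.iam.gserviceaccount.com" \
    --role="roles/storage.objectAdmin"

# Create a key and store it on the SAP HANA host, then reference
# this path on the #SERVICE_ACCOUNT parameter in parameters.txt.
gcloud iam service-accounts keys create /usr/sap/SID/SYS/global/hdb/opt/backint/backint-gcs/sa-key.json \
    --iam-account=backint-sa@example-project.iam.gserviceaccount.com
```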
If you use a customer-managed encryption key that is generated by Cloud Key Management Service to encrypt your backups in Cloud Storage, you need to give your service account access to the encryption key. For more information, see Assigning a Cloud KMS key to a service account.
Access to Google Cloud APIs and metadata servers
The Backint agent requires access to the following Google Cloud IP addresses and hosts during backup and recovery operations:
- For access to Cloud Storage:
  - Version 1.0.14 and later of the agent: storage.googleapis.com
  - Version 1.0.13 and earlier: www.googleapis.com
- If you specify a service account on the #SERVICE_ACCOUNT parameter, oauth2.googleapis.com for authentication.
- 169.254.169.254, the Compute Engine instance metadata server, which, by default, resolves internal DNS names.
- metadata.google.internal, also for VM instance metadata.
If the Backint agent and SAP HANA are running on a Compute Engine VM that does not have access to the internet, you need to configure Private Google Access so that Backint agent can interact with Cloud Storage and, if using a dedicated service account, authenticate itself with Google Cloud.
To configure Private Google Access, see Configuring Private Google Access.
Proxy servers and the Backint agent
By default, the Backint agent bypasses any HTTP proxy and does not read proxy environment variables in the operating system, such as http_proxy, https_proxy, or no_proxy.
If you have no alternative, or if your organization understands the performance implications and has the expertise required to support routing backups through a proxy server, you can configure the Backint agent to use a proxy.
The proxy settings for the Backint agent are contained in the net.properties file:
/usr/sap/SID/SYS/global/hdb/opt/backint/backint-gcs/jre/conf/net.properties
Bypassing a proxy server for backups and recoveries
Although the Backint agent bypasses proxy servers by default, you can make the bypass explicit by specifying the required Google Cloud domain names and IP addresses on the http.nonProxyHosts parameter in the /usr/sap/SID/SYS/global/hdb/opt/backint/backint-gcs/jre/conf/net.properties file. For example:
http.nonProxyHosts=localhost|127.*|[::1]|*.googleapis.com|169.254.169.254|metadata.google.internal
Using a proxy server for backups and recoveries
To configure the Backint agent to send backups through a proxy server, specify the proxy host and port number parameters in the file /usr/sap/SID/SYS/global/hdb/opt/backint/backint-gcs/jre/conf/net.properties.
For queries to the VM instance metadata, the Backint agent cannot use a proxy, so you must specify the domain name and IP address for instance metadata on the http.nonProxyHosts parameter.
The following example shows a valid proxy configuration for the Backint agent:
http.proxyHost=proxy-host
http.proxyPort=proxy-port
http.nonProxyHosts=localhost|127.*|[::1]|169.254.169.254|metadata.google.internal
https.proxyHost=proxy-host
https.proxyPort=proxy-port
Updates for the Backint agent
Google Cloud periodically releases new versions of the Backint agent that you can download and install yourself at no additional cost.
Before you update the Backint agent to a new version in your production environment, make sure to test the new version in a non-production environment.
Updating the Backint agent requires the SAP HANA host to support remote HTTP requests to https://www.googleapis.com/.
To update an existing instance of the Backint agent to a new version, see Updating the Backint agent to a new version.
Encryption for backups
Cloud Storage always encrypts your data before it is written to disk. To apply your own additional layer of encryption, you can provide your own encryption keys for the server-side encryption of your Backint agent backups.
You have two options for providing your own keys with the Backint agent:
To use a customer-managed encryption key, specify the path to the key on the #KMS_KEY_NAME parameter in the parameters.txt file. You also need to give the VM or Backint agent service account access to the key. For more information about giving a service account access to an encryption key, see Assigning a Cloud KMS key to a service account.
To use a customer-supplied encryption key, specify the path to the key on the #ENCRYPTION_KEY parameter in the parameters.txt file. The key must be a base64-encoded AES-256 key string, as described in Customer-supplied encryption keys.
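For illustration, one way to generate such a key string follows; the use of openssl and the file name are assumptions, and any method that yields 32 random bytes, base64-encoded, works:

```shell
# Generate 32 random bytes (256 bits) and base64-encode them.
openssl rand -base64 32 > backint-csek.key

# Restrict access: the key protects your backups.
chmod 600 backint-csek.key

# A base64-encoded 32-byte key is 44 characters long.
wc -c < backint-csek.key   # 45 bytes including the trailing newline
```

You would then set #ENCRYPTION_KEY to the fully qualified path of this file on the SAP HANA host.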
Encryption restrictions
The following restrictions apply to the encryption feature:
- If both #KMS_KEY_NAME and #ENCRYPTION_KEY are specified, the Backint agent fails and exits with a status of 1.
- If #PARALLEL_FACTOR is specified with either #KMS_KEY_NAME or #ENCRYPTION_KEY, the Backint agent fails and exits with a status of 1.
Configuration parameter reference
You can specify a number of options for the Backint agent in the parameters.txt configuration file.
When you first download the Backint agent, the parameters.txt file contains only two parameters:
- #BUCKET
- #DISABLE_COMPRESSION
Note that the # is part of the parameter name, and not a comment indicator.
Specify each parameter on a new line. Separate parameters and values with a space.
The Backint agent configuration parameters are shown in the following table.
Parameter and value | Description |
---|---|
#BUCKET bucket-name |
A required parameter that specifies the name of the Cloud Storage bucket that the Backint agent writes to and reads from. The Backint agent creates backup objects with the storage class of the bucket and supports all storage classes. The Backint agent uses Compute Engine default encryption to encrypt data at rest. |
#CHUNK_SIZE_MB MB |
Advanced tuning parameter.
Controls the size of HTTPS requests to Cloud Storage during backup or restore operations. The default chunk size is 100 MB, which means that a single HTTP request stream to or from Cloud Storage is kept open until 100 MB of data is transferred. Do not modify this setting unless instructed to do so by Customer Care. The default setting, which balances throughput and reliability, rarely needs to be changed. Because Backint agent retries failed HTTP requests multiple times before failing an operation, smaller chunk sizes result in less data that needs to be retransmitted if a request fails. Larger chunk sizes can improve throughput, but require more memory usage and more time to resend data in the event of a request failure. |
#DISABLE_COMPRESSION |
Optional parameter that disables the default, on-the-fly compression when the Backint agent writes backups to the Cloud Storage bucket.
Regardless of this setting, the Backint agent supports either compressed or uncompressed backup files during a restore operation. |
#ENCRYPTION_KEY path/to/key/file |
Specifies a path to a customer-supplied encryption key that
Cloud Storage uses to encrypt backups. The path
must be specified as a fully qualified path to a base64-encoded
AES-256 key.
You cannot specify #ENCRYPTION_KEY with #KMS_KEY_NAME or #PARALLEL_FACTOR. For more information about using your own encryption keys on Google Cloud, see Customer-supplied encryption keys. |
#KMS_KEY_NAME path/to/key/file |
Specifies a path to a customer-managed encryption key that
is generated by Cloud Key Management Service. Cloud Storage uses this key
to encrypt backups.
If SAP HANA is running on a Compute Engine VM, the key must be accessible to the VM. If SAP HANA is not running on Google Cloud, the Cloud KMS key must be linked to the Backint agent service account. For information, see Service accounts.
Specify the path by using the following format:
Where:
You cannot specify #KMS_KEY_NAME with #ENCRYPTION_KEY or #PARALLEL_FACTOR. For more information about managing your own encryption keys on Google Cloud, see Customer-managed encryption keys. |
#MAX_GCS_RETRY integer |
Defines the maximum number of times the Backint agent retries a failed attempt to read and write to Cloud Storage. The default is 5, which is the recommended value. |
#PARALLEL_FACTOR integer |
Optional parameter that enables parallel upload and sets the maximum number of parallel uploads. A value of `1` disables parallel uploads. The default is `1`. Do not enable parallel upload if you enable encryption with #ENCRYPTION_KEY or #KMS_KEY_NAME, if you use compression (that is, if #DISABLE_COMPRESSION is omitted), or if your Cloud Storage bucket implements a retention policy. |
#PARALLEL_PART_SIZE_MB integer |
Advanced tuning parameter.
Sets the size, in MB, of each part that is uploaded in parallel. The default is 128 MB. Do not modify this setting unless instructed to do so by Customer Care. The default setting rarely needs to be changed. |
#RATE_LIMIT_MB integer |
Optional parameter that sets an upper limit, in MB, on the outbound bandwidth to Compute Engine during backup or restore operations. By default, Google Cloud does not limit network bandwidth for the Backint agent. When set, throughput might vary, but will not exceed the specified limit. |
#SERVICE_ACCOUNT path/to/key/file |
Optional parameter that specifies the fully-qualified path to the
JSON-encoded Google Cloud service account key when
Compute Engine default authentication is not used. Specifying
#SERVICE_ACCOUNT directs the Backint agent to
use the key when authenticating to the Cloud Storage service.
The Compute Engine default authentication is recommended. |
#THREADS integer |
Advanced tuning parameter.
Sets the number of worker threads. The default is the number of processors in the machine. Do not modify this setting unless instructed to do so by Customer Care. The default setting rarely needs to be changed. |
#READ_IDLE_TIMEOUT integer |
Advanced tuning parameter.
Sets the maximum amount of time in milliseconds that the Backint agent will wait to open the backup file. The default is 1000. Do not modify this setting unless instructed to do so by Customer Care. The default setting rarely needs to be changed. |
#HTTP_READ_TIMEOUT integer |
Advanced tuning parameter.
Sets the timeout in milliseconds for reading responses from the Cloud Storage API requests. The default is -1; no timeout. Do not modify this setting unless instructed to do so by Customer Care. The default setting rarely needs to be changed. |
Logging for the Backint agent
In addition to the logs kept by SAP HANA in backup.log, the Backint agent writes operational and communication-error events to log files in the logs subdirectory of /usr/sap/SID/SYS/global/hdb/opt/backint/backint-gcs.
When the size of a log file reaches 10 MB, the Backint agent rotates the log files.
If necessary, you can edit the Backint agent logging configuration in /usr/sap/SID/SYS/global/hdb/opt/backint/backint-gcs/logging.properties.
The Backint agent also supports Cloud Logging. To enable Cloud Logging, see the Cloud Storage Backint agent for SAP HANA installation guide.
Using the Backint agent in SAP HANA HA deployments
In an SAP HANA high-availability cluster, you need to install the Backint agent on each node in the cluster.
Use the same Backint agent configuration with the same Cloud Storage bucket specifications for each SAP HANA instance in the HA cluster. You can use the same bucket specifications because, during normal operations, only the active SAP HANA instance in an HA configuration writes backups to Cloud Storage. The secondary system is in replication mode. This is true for data, log, and catalog backups.
Further, application clustering software, such as Pacemaker, prevents split-brain scenarios, in which more than one SAP HANA system in a cluster thinks that it is the primary instance.
However, during maintenance activities, when clustering might be disabled, if the standby database is removed from replication and brought back online, you need to make sure that backups are triggered only on the primary database.
Because the Backint agent is unaware of which SAP HANA system is currently the active system and has no scheduling or triggering mechanisms, you need to manage the scheduling and backup triggers by using SAP mechanisms, such as the SAP ABAP transaction DB13.
SAP ABAP applications connect to the HA cluster through the virtual IP, so the trigger is always routed to the active SAP HANA instance.
If the backup trigger is defined locally on each server, for example as a local operating system script, and both the primary and secondary systems think they are the active system, they both might attempt to write backups to the storage bucket.
Using the Backint agent in SAP HANA DR deployments
In a disaster recovery configuration, where a recovery instance of SAP HANA in another Google Cloud region is kept in sync by using asynchronous SAP HANA System Replication, specify a different bucket for the recovery instance than the primary SAP HANA system uses.
While the DR system is usually in replication mode and therefore cannot run a backup itself, during regular disaster recovery testing, the recovery instance is brought online and could trigger backups. If it does and the recovery system doesn't use a separate bucket, the backups might overwrite data from the primary database.
In the case of an actual disaster that requires you to recover from a backup to your DR region, you can update the Backint agent configuration to reference the multi-regional bucket that your primary HA system uses.
Using the Backint agent in SAP HANA scale-out systems
In SAP HANA scale-out systems, you need to install the Backint agent on each node in the system.
To simplify the management of the parameters.txt configuration file and, if you are using one, the Backint agent service account key, you can place these files in a shared NFS directory.
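As a sketch, assuming a shared mount at /hana/shared (a placeholder path), every node can read the same configuration, which in turn points at a shared service account key:

```
#BUCKET example-hana-backups
#SERVICE_ACCOUNT /hana/shared/backint/sa-key.json
```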