Process monitoring using Google Cloud's Agent for SAP

This planning guide focuses solely on the Process Monitoring metrics collection feature of Google Cloud's Agent for SAP. For information about the agent and all its features, see Google Cloud's Agent for SAP planning guide.

On Linux, Google Cloud's Agent for SAP can help you monitor the processes in your SAP applications and their runtime states. This is delivered through the collection of Process Monitoring metrics, which you can enable after installing the agent on your Compute Engine VM instances or Bare Metal Solution servers.

The information collected in the Process Monitoring metrics helps you troubleshoot issues with your SAP system. If an issue arises, these metrics can also help Cloud Customer Care reach a resolution more efficiently. The collected data also provides observability for your SAP HANA high-availability cluster configurations.

For information about how to configure Google Cloud's Agent for SAP to collect the Process Monitoring metrics, see Configure Process Monitoring metrics collection.

Types of Process Monitoring metrics

From version 2.6 of Google Cloud's Agent for SAP, the Process Monitoring metrics collected by the agent are referred to as follows:

  • Fast-changing metrics: These include sap/hana/availability, sap/hana/ha/availability, and sap/nw/availability. These metrics are collected at a default frequency of 5 seconds. You can change this collection frequency by using the configuration parameter process_metrics_frequency.
  • Slow-changing metrics: All Process Monitoring metrics other than the fast-changing ones. These metrics are collected at a default frequency of 30 seconds. You can change this collection frequency by using the configuration parameter slow_process_metrics_frequency.
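As a rough illustration, the following Python sketch builds a configuration fragment that sets both collection frequencies. The parameter names process_metrics_frequency and slow_process_metrics_frequency come from the list above. The enclosing collection_configuration object and the collect_process_metrics flag are assumptions about the layout of the agent's configuration file, so verify them against the configuration guide before relying on them.

```python
import json


def build_process_metrics_config(fast_seconds=5, slow_seconds=30):
    """Return a configuration fragment with both collection frequencies.

    The two *_frequency keys are named in this guide. The surrounding
    "collection_configuration" object and the "collect_process_metrics"
    flag are assumptions and may differ in your agent version.
    """
    if fast_seconds <= 0 or slow_seconds <= 0:
        raise ValueError("collection frequencies must be positive seconds")
    return {
        "collection_configuration": {
            "collect_process_metrics": True,
            "process_metrics_frequency": fast_seconds,
            "slow_process_metrics_frequency": slow_seconds,
        }
    }


if __name__ == "__main__":
    # Example: halve the collection rate for both metric groups.
    print(json.dumps(build_process_metrics_config(10, 60), indent=2))
```

The helper only produces the fragment. Merging it into an existing configuration file and restarting the agent are left out on purpose.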

Cloud Monitoring pricing

The Process Monitoring metrics that Google Cloud's Agent for SAP collects and sends to Monitoring are classified by Monitoring as chargeable metrics and are priced by ingested volume.

The frequency at which the agent queries your SAP systems to collect the Process Monitoring metrics affects the volume of metrics that get sent to Monitoring.

By default, the agent collects fast-changing metrics every 5 seconds and slow-changing metrics every 30 seconds. Collecting less frequently reduces the volume of metrics that get sent to Monitoring.
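To get a feel for how the collection frequency drives volume, the following sketch counts how many samples a single time series produces per day at a given interval. It uses the default intervals of 5 and 30 seconds described earlier. The function name is illustrative, and the sketch deliberately stops at sample counts: the bytes per sample and the price per unit are not covered here and must come from the pricing documentation.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(frequency_seconds):
    """Number of collection cycles in one day for a given interval."""
    if frequency_seconds <= 0:
        raise ValueError("frequency must be a positive number of seconds")
    return SECONDS_PER_DAY // frequency_seconds


if __name__ == "__main__":
    for label, freq in (("fast-changing", 5), ("slow-changing", 30)):
        count = samples_per_day(freq)
        print(f"{label}: {count} samples per time series per day")
```

At the defaults, a fast-changing time series yields 17,280 samples per day and a slow-changing one yields 2,880. Doubling an interval halves the count.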

For more information about Monitoring pricing, see Google Cloud Observability pricing.

Sample cost estimate

To see a sample cost estimate for the collection of the Process Monitoring metrics using Google Cloud's Agent for SAP, see Pricing example for metrics charged by bytes ingested.

Process Monitoring metrics

The following table describes the Process Monitoring metrics collected by Google Cloud's Agent for SAP. The metric strings in this table must be prefixed with workload.googleapis.com/. This prefix has been omitted from the entries in the following table.

Metric Category Description
sap/hana/service SAP HANA Numeric response code for SAP HANA service availability.
  • 0: Service is not running
  • 1: Service is running
sap/hana/ha/replication SAP HANA Numeric response code for SAP HANA system replication, based on SAP System ID, SAP Instance Number, and SAP Service Name.
  • 0: Error occurred.
  • 10: Replication is off (standalone mode).
  • 12: Replication is active. The current node is the secondary node.
  • 15: Replication is active. Initialization or sync with the primary is complete and the secondary is continuously replicating.
sap/hana/availability SAP HANA Numeric response code for SAP HANA system availability, based on SAP System ID and SAP Instance Number.
  • 0: One or more processes are not active
  • 1: All processes are active
sap/hana/ha/availability SAP HANA Numeric response code for SAP HANA system high availability state, based on SAP system ID and SAP Instance Number.
  • 0: Unknown state
  • 1: Current node is secondary
  • 2: Primary node has error
  • 3: Primary node is online but replication is not fully functional
  • 4: Primary node is online with replication running
sap/hana/query/state SAP HANA Numeric response code that represents the health check of SAP HANA based on the query select * from dummy. The value 0 indicates success. Any other value indicates a failure.
sap/hana/query/overalltime SAP HANA Reported only if query/state is 0. This is the overall time taken by the query, including client-side time and server-side time, in microseconds.
sap/hana/query/servertime SAP HANA Reported only if query/state is 0. This is the time taken by the server to process the query, in microseconds.
sap/cluster/failcounts SAP HANA The failcount value of the Linux HA resources. If the resource is not present, then there is no failcount registered. Otherwise, the cluster monitoring crm_mon reports the number of failed actions.
sap/cluster/nodes Pacemaker Cluster Numeric response code that indicates the state of a node in the Linux HA cluster.
  • -10: Unknown
  • -1: Unclean state
  • 0: Shutdown
  • 1: Standby
  • 2: Online
sap/cluster/resources Pacemaker Cluster Numeric response code that indicates if the Linux HA cluster resource is up and running.
  • -10: Unknown
  • 0: Failed
  • 1: Stopped
  • 2: Starting
  • 3: Resource is in one of the following steady states: Master, Slave, or Started
sap/nw/availability SAP NetWeaver Numeric response code for SAP NetWeaver system availability, based on SAP System ID, SAP Instance Number, and SAP Service Name.
  • 0: Unknown state
  • 1: Current node is active or up
sap/nw/service SAP NetWeaver Numeric response code for SAP NetWeaver service availability, based on SAP System ID, SAP Instance Number, and SAP Service Name.
  • 0: Service is not running
  • 1: Service is running
sap/nw/icm/rcode SAP NetWeaver Response code based on the HTTP 1.1 protocol of a non-authenticated ICM URL resource (local call).
sap/nw/icm/rtime SAP NetWeaver Response time in milliseconds of a non-authenticated ICM URL resource (local call).
sap/nw/ms/rcode SAP NetWeaver Response code based on the HTTP 1.1 protocol of a non-authenticated Message Server URL resource (local call).
sap/nw/ms/rtime SAP NetWeaver Response time in milliseconds of a non-authenticated Message Server URL resource (local call).
sap/nw/ms/wp SAP NetWeaver Number of ABAP work processes (NW ABAP) or Java server nodes (NW Java) reported by the Message Server information page.
sap/nw/abap/proc/busy SAP NetWeaver Number of busy ABAP work processes by type, such as DIA, ICM, and DISP.
sap/nw/abap/proc/count SAP NetWeaver Total number of ABAP work processes by type, such as DIA, ICM, and DISP.
sap/nw/abap/queue/current SAP NetWeaver The current number of ABAP queues used by the ABAP work processes, grouped by work process type, such as DIA, ICM, and DISP.
sap/nw/abap/queue/peak SAP NetWeaver The peak number of ABAP queues used by the ABAP work processes, grouped by work process type, such as DIA, ICM, and DISP.
sap/nw/abap/sessions SAP NetWeaver Number of ABAP sessions by session type.
sap/nw/abap/rfc SAP NetWeaver Number of ABAP RFC connections by session type.
sap/nw/enq/locks/usercountowner SAP NetWeaver Number of enqueue locks in SAP NetWeaver systems. If your system has a lot of open lock entries, then it can lead to performance issues for your users.
sap/mntmode Additional SAP metrics Maintenance mode of the corresponding SAP System ID (SID) that has been set manually to indicate that the system is intentionally down (maintenancemode = TRUE). The value of this metric is used to suppress the alerts for the systems that are unavailable during planned maintenance.

To notify the agent if a particular SID is undergoing planned maintenance, run the following command:

google_cloud_sap_agent maintenance \
    --enable=TRUE or FALSE \
    --sid=SID
sap/service/is-failed Additional SAP metrics Indicates if the OS services related to SAP and cluster services failed. The exit code 0 represents a failure.
sap/service/is-disabled Additional SAP metrics This metric is populated when the pacemaker, corosync, sapconf, saptune, and sapinit services are not enabled.
sap/hana/cpu/utilization Additional SAP metrics Per-process CPU utilization (%) of SAP HANA processes.
sap/nw/cpu/utilization Additional SAP metrics Per-process CPU utilization (%) of SAP NetWeaver processes.
sap/control/cpu/utilization Additional SAP metrics Per-process CPU utilization (%) of SAP Control processes.
sap/hana/memory/utilization Additional SAP metrics Per-process memory utilization (MB) of the HANA processes.
sap/nw/memory/utilization Additional SAP metrics Per-process memory utilization (MB) of the NetWeaver processes.
sap/control/memory/utilization Additional SAP metrics Per-process memory utilization (MB) of the SAP Control processes.
sap/hana/iops/reads Additional SAP metrics Per-process read IOPS for SAP HANA processes.
sap/hana/iops/writes Additional SAP metrics Per-process write IOPS for SAP HANA processes.
sap/nw/iops/reads Additional SAP metrics Per-process read IOPS for SAP NetWeaver processes.
sap/nw/iops/writes Additional SAP metrics Per-process write IOPS for SAP NetWeaver processes.
sap/infra/migration Google Cloud infrastructure metrics Indicates if a VM instance is undergoing a live migration.
sap/pacemaker Additional SAP metrics Numeric response code that conveys if the host includes a Pacemaker configuration.
  • 0: No Pacemaker configuration found
  • 1: Pacemaker configuration found
sap/hana/volumes Additional SAP metrics Exposes the following information about the mounted SAP HANA volumes: total size of the volume, used storage, available storage, and storage usage percentage.
sap/networkstats/rtt Additional SAP metrics The average round trip time, in milliseconds.

This metric contains TCP connection information related to your SAP HANA system. This metric is collected for sockets of the SAP HANA hdbnameserver process using the ss utility.

sap/networkstats/rcv_rtt Additional SAP metrics The time taken by the remote client to exhaust the current advertised remote receive window (RWIN) if no userspace consumption of that data has occurred. It is based on the observed bandwidth of the connection and returns a non-zero value.

This metric contains TCP connection information related to your SAP HANA system. This metric is collected for sockets of the SAP HANA hdbnameserver process using the ss utility.

sap/networkstats/rto Additional SAP metrics The TCP retransmission timeout, in milliseconds.

This metric contains TCP connection information related to your SAP HANA system. This metric is collected for sockets of the SAP HANA hdbnameserver process using the ss utility.

sap/networkstats/bytes_acked Additional SAP metrics The number of bytes acknowledged.

This metric contains TCP connection information related to your SAP HANA system. This metric is collected for sockets of the SAP HANA hdbnameserver process using the ss utility.

sap/networkstats/bytes_received Additional SAP metrics The number of bytes received.

This metric contains TCP connection information related to your SAP HANA system. This metric is collected for sockets of the SAP HANA hdbnameserver process using the ss utility.

sap/networkstats/lastsnd Additional SAP metrics The time, in milliseconds, since the last packet was sent.

This metric contains TCP connection information related to your SAP HANA system. This metric is collected for sockets of the SAP HANA hdbnameserver process using the ss utility.

sap/networkstats/lastrcv Additional SAP metrics The time, in milliseconds, since the last packet was received.

This metric contains TCP connection information related to your SAP HANA system. This metric is collected for sockets of the SAP HANA hdbnameserver process using the ss utility.
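Several metrics in the preceding table report numeric codes rather than readable states. The following sketch shows one way to translate such codes into the labels listed in the table, for example when formatting an alert message. It covers only three of the coded metrics. The dictionary layout and the describe function are illustrative and are not part of the agent.

```python
# Code-to-label mappings copied from the table for three metrics.
STATE_LABELS = {
    "sap/hana/ha/availability": {
        0: "Unknown state",
        1: "Current node is secondary",
        2: "Primary node has error",
        3: "Primary node is online but replication is not fully functional",
        4: "Primary node is online with replication running",
    },
    "sap/cluster/nodes": {
        -10: "Unknown",
        -1: "Unclean state",
        0: "Shutdown",
        1: "Standby",
        2: "Online",
    },
    "sap/cluster/resources": {
        -10: "Unknown",
        0: "Failed",
        1: "Stopped",
        2: "Starting",
        3: "Steady state (Master, Slave, or Started)",
    },
}


def describe(metric, value):
    """Return a readable label for a coded metric value."""
    labels = STATE_LABELS.get(metric)
    if labels is None:
        return f"{metric}={value} (no label table for this metric)"
    return labels.get(value, f"unrecognized code {value}")


if __name__ == "__main__":
    print(describe("sap/hana/ha/availability", 4))
    print(describe("sap/cluster/nodes", -1))
```

Unknown codes fall through to a generic message instead of raising an error, so the helper tolerates values that a later agent version might introduce.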

Viewing metrics in Monitoring

Google Cloud provides custom dashboards that help you visualize the Process Monitoring metrics collected by Google Cloud's Agent for SAP. See the dashboards/google-cloud-agent-for-sap directory in the GoogleCloudPlatform/monitoring-dashboard-samples repository on GitHub.

For information about these dashboards, including installation instructions, see View the collected metrics.

For information about finding metrics data in Monitoring and configuring alert notifications, see Metrics in Monitoring.
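When you search for these metrics in Monitoring, remember that the metric strings listed in this guide omit the workload.googleapis.com/ prefix. The following sketch builds the fully qualified metric type and a filter expression of the form metric.type = "..." that selects it. The function names are illustrative. The sketch only constructs strings and does not call any Monitoring API.

```python
METRIC_PREFIX = "workload.googleapis.com/"


def full_metric_type(short_name):
    """Prepend the workload prefix to a metric string from the table."""
    if short_name.startswith(METRIC_PREFIX):
        return short_name  # already fully qualified
    return METRIC_PREFIX + short_name.lstrip("/")


def monitoring_filter(short_name):
    """Build a filter expression that selects a single metric type."""
    return f'metric.type = "{full_metric_type(short_name)}"'


if __name__ == "__main__":
    print(full_metric_type("sap/hana/availability"))
    print(monitoring_filter("sap/nw/availability"))
```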