Custom Metrics from the Agent

This guide explains how you can configure the Stackdriver Monitoring agent to recognize and export your application metrics to Stackdriver.

The Stackdriver Monitoring agent is a collectd daemon. In addition to exporting many predefined system and third-party metrics to Stackdriver, the agent can export your own collectd application metrics to Stackdriver as custom metrics. Metrics produced by your own collectd plugins can be exported in the same way.

An alternative way to export application metrics to Stackdriver is to use StatsD. Stackdriver Monitoring provides a default configuration that maps StatsD metrics to custom metrics. If you are satisfied with that mapping, you do not need the customization steps described below. For more information, see the StatsD plugin.

If you are unfamiliar with Stackdriver's custom metrics, see Metrics, Time Series, and Resources and Using Custom Metrics.

Before you begin

  • The Stackdriver Monitoring agent is fully available to GCP projects in the Stackdriver Premium service tier. If you opt in to metric overage charges, Basic service tier projects can also receive custom metrics from the agent. If you create a new Stackdriver account, you'll get a 30-day free trial of the Premium tier.

  • Install the most recent Stackdriver Monitoring agent on a VM instance and verify it is working. To update your agent, see Updating the agent.

  • Configure collectd to get monitoring data from your application. Collectd supports many application frameworks and standard monitoring endpoints through its read plugins. Find a read plugin that works for you.

  • (Optional) As a convenience, add the agent's collectd reference documentation to your system's man pages by updating the MANPATH variable and then running mandb:

    export MANPATH="$MANPATH:/opt/stackdriver/collectd/share/man"
    sudo mandb
    

    The man pages are for stackdriver-collectd.

Important files and directories

The following files and directories, created by installing the agent, are relevant to configuring the Stackdriver Monitoring agent (collectd):

/opt/stackdriver/collectd/etc/collectd.conf
The collectd configuration file used by the agent, generated from collectd-gcm.conf.tmpl at init. You should not have to change this file.
/opt/stackdriver/collectd/etc/collectd.d/
The directory for user-added configuration files. To send custom metrics from the agent, you should place the required configuration files, discussed below, in this directory.
/opt/stackdriver/collectd/share/man/*
The documentation for the agent's version of collectd. Consider installing the man pages where you can easily reference them, as described in Before you begin.
/etc/init.d/stackdriver-agent
The init script for the agent.

How Stackdriver Monitoring handles collectd metrics

As background, the agent processes collectd metrics and sends them to Stackdriver Monitoring in the following ways:

  1. Custom metrics. Collectd metrics that have the metadata key stackdriver_metric_type and a single data source are handled as custom metrics and sent to Stackdriver Monitoring using the projects.timeSeries.create API method.

  2. Curated metrics. All other collectd metrics are sent to Stackdriver Monitoring using the projects.collectdTimeSeries.create API method. Only metrics that appear in the list of curated metrics are accepted and processed.

  3. Discarded metrics. Collectd metrics that are not in the curated metrics list and are not custom metrics are silently discarded by Stackdriver Monitoring. The agent itself is not aware of which metrics are accepted or discarded.
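The three-way classification above can be sketched in Python. This is an illustrative simplification of the backend's behavior, not real agent code; the function name and the curated list are hypothetical:

```python
# Hypothetical sketch of how an incoming collectd metric is classified
# (simplified; illustration only, not the agent's actual implementation).

CURATED_METRICS = {("df", "df_complex"), ("cpu", "cpu")}  # illustrative subset

def classify(metric):
    """Return 'custom', 'curated', or 'discarded' for a collectd metric dict."""
    meta = metric.get("meta", {})
    # Custom: has the stackdriver_metric_type metadata key and a single
    # data source; sent via projects.timeSeries.create.
    if "stackdriver_metric_type" in meta and len(metric["values"]) == 1:
        return "custom"
    # Everything else goes to projects.collectdTimeSeries.create, where
    # only metrics on the curated list are accepted and processed.
    if (metric["plugin"], metric["type"]) in CURATED_METRICS:
        return "curated"
    # Anything else is silently dropped by Stackdriver Monitoring.
    return "discarded"
```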

Writing custom metrics with the agent

You are configuring the agent to send metric data points to Stackdriver Monitoring. Each point must be associated with a custom metric that you define with a metric descriptor. These concepts are described in detail at Metrics, Time Series, and Resources and Custom Metrics.

You can have a collectd metric treated as a custom metric by adding the proper metadata to the metric:

  • stackdriver_metric_type : (required) the name of the exported metric. Example: custom.googleapis.com/my_custom_metric.

  • label:[LABEL] : (optional) additional labels for the exported metric. For example, if you want a Stackdriver Monitoring STRING label named color, then your metadata key would be label:color and the value of the key could be "blue". You can have up to 10 labels per metric type.

You can use a collectd filter chain to modify the metadata for your metrics. Because filter chains cannot modify the list of data sources and custom metrics only support a single data source, any collectd metrics that you want to use with this facility must have a single data source.
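The metadata requirements above can be illustrated with a small Python sketch. In practice a collectd filter chain sets this metadata, not Python; the helper name is hypothetical and exists only to make the constraints concrete:

```python
def add_custom_metric_metadata(vl, metric_type, labels):
    """Attach the metadata keys the agent looks for (hypothetical helper;
    a collectd filter chain does this in practice)."""
    if len(vl["values"]) != 1:
        # Filter chains cannot modify the data-source list, and custom
        # metrics support only a single data source.
        raise ValueError("custom metrics require exactly one data source")
    if len(labels) > 10:
        raise ValueError("at most 10 labels per metric type")
    meta = vl.setdefault("meta", {})
    meta["stackdriver_metric_type"] = metric_type   # required
    for name, value in labels.items():
        meta["label:" + name] = value               # optional labels
    return vl
```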

Example

In this example we will monitor active Nginx connections from two Nginx services, my_service_a and my_service_b. We will send these to Stackdriver Monitoring using a custom metric. We will take the following steps:

  1. Identify the collectd metrics for each Nginx service.

  2. Define a Stackdriver Monitoring metric descriptor.

  3. Configure a collectd filter chain to add metadata to the collectd metrics, to meet the expectations of the Stackdriver Monitoring agent.

Incoming collectd metrics

Collectd expects metrics to consist of the following components. The first five components make up the collectd identifier for the metric:

    Host, Plugin, Plugin-instance, Type, Type-instance, [value]

In this example, the metrics you want to send as a custom metric have the following values:

Component        Expected value(s)
---------        -----------------
Host             any
Plugin           curl_json
Plugin instance  nginx_my_service_a or nginx_my_service_b [1]
Type             gauge
Type instance    active-connections
[value]          any value [2]

Notes:
[1] In the example, this value encodes both the application (Nginx) and the connected service name.
[2] The value is typically a timestamp and a double-precision number. Stackdriver Monitoring handles the details of interpreting the various kinds of values. Compound values are not currently supported by the Stackdriver Monitoring agent.

Stackdriver Monitoring metric descriptor and time series

On the Stackdriver Monitoring side, design a metric descriptor for your custom metric. The following descriptor is a reasonable choice for the data in this example:

  • Name: custom.googleapis.com/nginx/active_connections
  • Labels:
    • service_name (STRING): The name of the service connected to Nginx.
  • Kind: GAUGE
  • Type: DOUBLE

Once you've designed the metric descriptor, you can create it using projects.metricDescriptors.create or you can let it be created for you from the time series metadata, discussed below. For more information, see Creating metric descriptors on this page.

The time series data for this metric descriptor must contain the following information, because of the way the metric descriptor is defined:

  • Metric name: custom.googleapis.com/nginx/active_connections
  • Metric label values:
    • service_name: either "my_service_a" or "my_service_b"

Other time series information, including the associated monitored resource—the VM instance sending the data—and the metric's data point, is automatically obtained by the agent for all metrics. You don't have to do anything special.
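Putting the pieces together, the request body the agent effectively assembles for projects.timeSeries.create looks roughly like the following. Field names follow the Monitoring v3 API; the resource labels, timestamp, and value are made-up illustrations:

```python
# Illustrative projects.timeSeries.create request body for the example
# metric. The agent fills in the monitored resource and point automatically;
# the concrete values below are invented for illustration.
time_series_body = {
    "timeSeries": [{
        "metric": {
            "type": "custom.googleapis.com/nginx/active_connections",
            "labels": {"service_name": "my_service_a"},
        },
        # The monitored resource: the VM instance sending the data.
        "resource": {
            "type": "gce_instance",
            "labels": {"instance_id": "1234567890", "zone": "us-central1-a"},
        },
        "points": [{
            "interval": {"endTime": "2016-12-08T15:13:45Z"},
            "value": {"doubleValue": 42.0},  # GAUGE/DOUBLE per the descriptor
        }],
    }]
}
```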

Your filter chain

Create a file, /opt/stackdriver/collectd/etc/collectd.d/nginx_curl_json.conf, containing the following code:

LoadPlugin match_regex
LoadPlugin target_set
LoadPlugin target_replace

# Insert a new rule in the default "PreCache" chain, to divert your metrics.
PreCacheChain "PreCache"
<Chain "PreCache">
  <Rule "jump_to_custom_metrics_from_curl_json">
    # If the plugin name and instance match, this is PROBABLY a metric we're looking for:
    <Match regex>
      Plugin "^curl_json$"
      PluginInstance "^nginx_"
    </Match>
    <Target "jump">
      # Go execute the following chain; then come back.
      Chain "PreCache_curl_json"
    </Target>
  </Rule>
  # Continue processing metrics in the default "PreCache" chain.
</Chain>

# Following is a NEW filter chain, just for your metric.
# It is only executed if the default chain "jumps" here.
<Chain "PreCache_curl_json">

  # The following rule does all the work for your metric:
  <Rule "rewrite_curl_json_my_special_metric">
    # Do a careful match for just your metrics; if it fails, drop down
    # to the next rule:
    <Match regex>
      Plugin "^curl_json$"                   # Match on plugin.
      PluginInstance "^nginx_my_service_.*$" # Match on plugin instance.
      Type "^gauge$"                         # Match on type.
      TypeInstance "^active-connections$"    # Match on type instance.
    </Match>

    <Target "set">
      # Specify the metric descriptor name:
      MetaData "stackdriver_metric_type" "custom.googleapis.com/nginx/active_connections"
      # Specify a value for the "service_name" label; clean it up in the next Target:
      MetaData "label:service_name" "%{plugin_instance}"
    </Target>

    <Target "replace">
      # Remove the "nginx_" prefix in the service_name to get the real service name:
      MetaData "label:service_name" "nginx_" ""
    </Target>
  </Rule>

  # The following rule is run after rewriting your metric, or
  # if the metric wasn't one of your custom metrics. The rule returns to
  # the default "PreCache" chain. The default processing
  # will write all metrics to Stackdriver Monitoring,
  # which will drop any unrecognized metrics: ones that are not
  # in the list of curated metrics and do not have
  # the custom metric metadata.
  <Rule "go_back">
    Target "return"
  </Rule>
</Chain>
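To sanity-check the regexes and the label rewrite before deploying the chain, the match/set/replace logic above can be mimicked in Python. This is an illustration only; the agent runs the collectd chain, not this code:

```python
import re

def rewrite(vl):
    """Mimic the 'rewrite_curl_json_my_special_metric' rule above
    (illustration only, for checking the regexes and label rewrite)."""
    # <Match regex>: all four fields must match, or the rule does nothing.
    if not (re.match(r"^curl_json$", vl["plugin"])
            and re.match(r"^nginx_my_service_.*$", vl["plugin_instance"])
            and re.match(r"^gauge$", vl["type"])
            and re.match(r"^active-connections$", vl["type_instance"])):
        return vl
    meta = vl.setdefault("meta", {})
    # <Target "set">: name the metric and seed the label from plugin_instance.
    meta["stackdriver_metric_type"] = "custom.googleapis.com/nginx/active_connections"
    meta["label:service_name"] = vl["plugin_instance"]
    # <Target "replace">: strip the "nginx_" prefix to get the service name.
    meta["label:service_name"] = re.sub("nginx_", "", meta["label:service_name"])
    return vl
```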

Load the new configuration

Restart your agent to pick up the new configuration by executing the following command on your VM instance:

sudo service stackdriver-agent restart

Your custom metric information will begin to flow into Stackdriver Monitoring.

Reference and best practices

Metric descriptors and time series

For an introduction to Stackdriver metrics, see Metrics, Time Series, and Resources. More details are available in Custom Metrics.

Metric descriptors. A metric descriptor has the following significant pieces:

  • A name of the form custom.googleapis.com/[NAME1]/.../[NAMEn]. For example:

    custom.googleapis.com/my_measurement
    custom.googleapis.com/instance/network/received_packets_count
    custom.googleapis.com/instance/network/sent_packets_count
    

    Custom metric names must begin with custom.googleapis.com/. The recommended naming is hierarchical to make the metrics easier for people to keep track of. Metric names cannot contain hyphens; for the exact naming rules, see Rules for metric and label names.

  • Up to 10 labels to annotate the metric data, such as device_name, fault_type, or response_code. The values of the labels are not specified in the metric descriptor.

  • The kind and type of the data points, such as "a gauge value of type double". For more information, see Metric kinds and value types.
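For the example in this guide, those pieces come together in a projects.metricDescriptors.create request body like the following. Field names follow the Monitoring v3 API; the description strings are illustrative:

```python
# Illustrative projects.metricDescriptors.create request body for the
# example metric (Monitoring v3 field names; descriptions are made up).
descriptor_body = {
    "type": "custom.googleapis.com/nginx/active_connections",
    "metricKind": "GAUGE",
    "valueType": "DOUBLE",
    "description": "Active connections, per service, reported by Nginx.",
    "labels": [{
        "key": "service_name",
        "valueType": "STRING",
        "description": "The name of the service connected to Nginx.",
    }],
}
```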

Time series. A metric data point has the following significant pieces:

  • The name of the associated metric descriptor.

  • Values for all of the metric descriptor's labels.

  • A timestamped value consistent with the metric descriptor's type and kind.

  • The monitored resource the data came from, typically a VM instance. Space for the resource is built in, so the descriptor doesn't need a separate label for it.

Creating metric descriptors

You don't have to create a metric descriptor ahead of time. When a data point arrives in Stackdriver Monitoring, the point's metric name, labels, and the point's value can be used to automatically create a gauge or cumulative metric descriptor. For more information, see Auto-creation of custom metrics.

However, there are advantages to creating your own metric descriptor:

  • You can include some thoughtful documentation for the metric and its labels.

  • You can specify additional kinds and types of metrics. The only (kind, type) combinations supported by the agent are (GAUGE, DOUBLE) and (CUMULATIVE, INT64). For more information, see Metric kinds and value types.

  • You can specify label types other than STRING.

If you write a data point to Stackdriver Monitoring that uses a metric name that is not defined, then a new metric descriptor will be created for the data point. This can be a problem if you are debugging the code that writes metric data—misspelling the metric name will result in spurious metric descriptors.
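A cheap pre-flight check can catch some of these misspellings before a spurious descriptor is auto-created. The sketch below enforces only the two rules stated in this guide (the required prefix and the no-hyphens rule); the complete rules are in Rules for metric and label names:

```python
def looks_like_valid_custom_metric(name):
    """Pre-flight check against the naming rules stated above: the name
    must begin with custom.googleapis.com/ and must not contain hyphens.
    (Partial check only; see Rules for metric and label names.)"""
    return name.startswith("custom.googleapis.com/") and "-" not in name
```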

After you create a metric descriptor, or after it is created for you, it cannot be changed. For example, you can't add or remove labels. You can only delete the metric descriptor—which deletes all its data—and then recreate the descriptor the way you want.

For more details about creating metric descriptors, see Defining your metric.

Metric descriptor cost

GCP has over a thousand built-in metrics, but there are limits and charges associated with creating your own custom metric descriptors. You should be thinking about having a few, or a few dozen of them—not a few thousand. A single metric descriptor can handle a lot of data:

  • A single descriptor can have data from any number of VM instances, or other sources like databases, load balancers, etc.

  • A single descriptor can have up to 10 labels, and a single label can have any number of different values.

  • A single descriptor can have thousands of time series associated with it. There is a separate time series for each unique combination of values for the VM instance and metric labels. There is an upper limit on the number of time series. For more information, see Quota Policy.

If you accidentally create too many metric descriptors while you are developing your configurations, you can find and delete the descriptors using the Stackdriver Monitoring API. For more information, see projects.metricDescriptors.

For more pricing information, see Stackdriver pricing. For quota information, see Quotas.

Troubleshooting

This section explains how to configure the Stackdriver Monitoring agent's write_log plugin to dump out the full set of metric points, including metadata. This can be used to determine what points need to be transformed, as well as to ensure your transformations behave as expected.

Enabling write_log

The write_log plugin is included in the stackdriver-agent package (version 5.5.2-356 and above). To enable the plugin, edit /opt/stackdriver/collectd/etc/collectd-gcm.conf.tmpl as root:

  1. Right after LoadPlugin write_gcm, add:

    LoadPlugin write_log
    
  2. Right after <Plugin "write_gcm">…</Plugin>, add:

    <Plugin "write_log">
      Format JSON
    </Plugin>
    
  3. Search for <Target "write">…</Target> and after every Plugin "write_gcm", add:

    Plugin "write_log"
    
  4. Save your changes and restart the agent:

    sudo service stackdriver-agent restart
    

These changes will print one log line per metric value reported, including the full collectd identifier, the metadata entries, and the value.

Output of write_log

If you were successful in the previous step, you should see the output of write_log in the system logs:

  • Debian-based Linux: /var/log/syslog
  • Red Hat-based Linux: /var/log/messages

The sample lines below have been formatted to make them easier to read in this document.

Dec  8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
    "values":[1933524992], "dstypes":["gauge"], "dsnames":["value"],
    "time":1481210025.252, "interval":60.000,
    "host":"test-write-log.c.test-write-log.internal",
    "plugin":"df", "plugin_instance":"udev", "type":"df_complex", "type_instance":"free"}]

Dec  8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
    "values":[0], "dstypes":["gauge"], "dsnames":["value"],
    "time":1481210025.252, "interval":60.000,
    "host":"test-write-log.c.test-write-log.internal",
    "plugin":"df", "plugin_instance":"udev", "type":"df_complex", "type_instance":"reserved"}]
