This guide explains how you can configure the Monitoring agent to recognize and export your application metrics to Cloud Monitoring.
The Monitoring agent is a collectd daemon. In addition to exporting many predefined system and third-party metrics to Cloud Monitoring, the agent can export your own collectd application metrics to Monitoring as custom metrics. Your collectd plugins can also export to Monitoring.
An alternative way to export application metrics to Monitoring is to use StatsD. Cloud Monitoring provides a default configuration that maps StatsD metrics to custom metrics. If you are satisfied with that mapping, you don't need the customization steps described below. For more information, see the StatsD plugin.
If you are unfamiliar with Monitoring's custom metrics, see Metrics, time series, and resources, Structure of time series, and Using custom metrics.
Before you begin
Install the most recent Monitoring agent on a VM instance and verify it is working. To update your agent, see Updating the agent.
Configure collectd to get monitoring data from your application. Collectd supports many application frameworks and standard monitoring endpoints through its read plugins. Find a read plugin that works for you.
(Optional) As a convenience, add the agent's collectd reference documentation to your system's man pages by updating the MANPATH variable and then running mandb:
export MANPATH="$MANPATH:/opt/stackdriver/collectd/share/man"
sudo mandb
The man pages are for stackdriver-collectd.
Important files and directories
The following files and directories, created by installing the agent, are relevant to using the Monitoring agent (collectd):
/etc/stackdriver/collectd.conf
The collectd configuration file used by the agent. Edit this file to change general configuration.
/etc/stackdriver/collectd.d/
The directory for user-added configuration files. To send custom metrics from the agent, you place the required configuration files, discussed below, in this directory. For backward compatibility, the agent also looks for files in /opt/stackdriver/collectd/etc/collectd.d/.
/opt/stackdriver/collectd/share/man/*
The documentation for the agent's version of collectd. You can add these pages to your system's set of man pages; see Before you begin for details.
/etc/init.d/stackdriver-agent
The init script for the agent.
How Monitoring handles collectd metrics
As background, the Monitoring agent processes collectd metrics and sends them to Monitoring, which treats each metric as a member of one of the following categories:
Custom metrics. Collectd metrics that have the metadata key stackdriver_metric_type and a single data source are handled as custom metrics and sent to Monitoring using the projects.timeSeries.create method in the Monitoring API.
Curated metrics. All other collectd metrics are sent to Monitoring using an internal API. Only the metrics in the list of curated metrics are accepted and processed.
Discarded metrics. Collectd metrics that aren't in the curated metrics list and aren't custom metrics are silently discarded by Monitoring. The agent itself isn't aware of which metrics are accepted or discarded.
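The three categories above can be sketched as a small decision function. This is only an illustration of the rules as stated, not the agent's or Monitoring's actual code; the `CollectdMetric` class, the `classify` function, and the contents of `CURATED` are hypothetical, and the real curated list is maintained on the Monitoring side.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the curated metrics list. The real list is
# maintained by Cloud Monitoring and is not visible to the agent.
CURATED = {("df", "df_complex"), ("cpu", "cpu")}


@dataclass
class CollectdMetric:
    plugin: str
    type: str
    dsnames: list                      # names of the metric's data sources
    meta: dict = field(default_factory=dict)


def classify(metric, curated=CURATED):
    """Return 'custom', 'curated', or 'discarded' per the rules above."""
    # Custom: has the metadata key and exactly one data source.
    if "stackdriver_metric_type" in metric.meta and len(metric.dsnames) == 1:
        return "custom"
    # Everything else is accepted only if it is on the curated list.
    if (metric.plugin, metric.type) in curated:
        return "curated"
    return "discarded"


if __name__ == "__main__":
    tagged = CollectdMetric(
        "curl_json", "gauge", ["value"],
        {"stackdriver_metric_type": "custom.googleapis.com/my_custom_metric"},
    )
    print(classify(tagged))
    print(classify(CollectdMetric("curl_json", "gauge", ["value"])))
```

Note that a metric carrying the metadata key but more than one data source falls through to the curated check, which is why single-data-source metrics are required for custom metrics.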
Writing custom metrics with the agent
You are configuring the agent to send metric data points to Monitoring. Each point must be associated with a custom metric, which you define with a metric descriptor. These concepts are introduced in Metrics, time series, and resources and described in detail at Structure of time series and Using custom metrics.
You can have a collectd metric treated as a custom metric by adding the proper metadata to the metric:
stackdriver_metric_type: (required) the name of the exported metric. Example: custom.googleapis.com/my_custom_metric.
label:[LABEL]: (optional) additional labels for the exported metric. For example, if you want a Monitoring STRING label named color, then your metadata key would be label:color and the value of the key could be "blue". You can have up to 10 labels per metric type.
You can use a collectd filter chain to modify the metadata for your metrics. Because filter chains can't modify the list of data sources and custom metrics only support a single data source, any collectd metrics that you want to use with this facility must have a single data source.
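To make the metadata convention concrete, the following sketch splits a collectd metadata dictionary into the metric name and its labels. It is a reading aid under the rules stated above, not the agent's implementation; `split_metadata`, `LABEL_PREFIX`, and `MAX_LABELS` are hypothetical names.

```python
LABEL_PREFIX = "label:"
MAX_LABELS = 10  # the per-metric-type label limit stated above


def split_metadata(meta):
    """Split collectd metadata into (metric_type, labels)."""
    metric_type = meta.get("stackdriver_metric_type")
    if metric_type is None:
        raise ValueError("stackdriver_metric_type is required")
    # Every key of the form "label:NAME" becomes a label called NAME.
    labels = {
        key[len(LABEL_PREFIX):]: str(value)
        for key, value in meta.items()
        if key.startswith(LABEL_PREFIX)
    }
    if len(labels) > MAX_LABELS:
        raise ValueError(f"at most {MAX_LABELS} labels are allowed")
    return metric_type, labels


if __name__ == "__main__":
    print(split_metadata({
        "stackdriver_metric_type": "custom.googleapis.com/my_custom_metric",
        "label:color": "blue",
    }))
```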
Example
In this example we will monitor active Nginx connections from two Nginx services, my_service_a and my_service_b. We will send these to Monitoring using a custom metric. We will take the following steps:
Identify the collectd metrics for each Nginx service.
Define a Monitoring metric descriptor.
Configure a collectd filter chain to add metadata to the collectd metrics, to meet the expectations of the Monitoring agent.
Incoming collectd metrics
Collectd expects metrics to consist of the following components. The first five components make up the collectd identifier for the metric:
Host, Plugin, Plugin-instance, Type, Type-instance, [value]
In this example, the metrics you want to send as a custom metric have the following values:
Component | Expected value(s) |
---|---|
Host | any |
Plugin | curl_json |
Plugin instance | nginx_my_service_a or nginx_my_service_b [1] |
Type | gauge |
Type instance | active-connections |
[value] | any value [2] |
Notes:
[1] In the example, this value encodes both the application (Nginx) and the connected service name.
[2] The value is typically a timestamp and double-precision number. Monitoring handles the details of interpreting the various kinds of values. Compound values aren't currently supported by the Monitoring agent.
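The five identifier components are conventionally joined by collectd into a single string of the form host/plugin-plugin_instance/type-type_instance. The sketch below builds that string for the example metric; the function name `collectd_identifier` and the host name `my-vm` are hypothetical.

```python
def collectd_identifier(host, plugin, plugin_instance, type_, type_instance):
    """Join the five components into collectd's slash-separated identifier."""
    # The instance parts are optional; omit the hyphen when one is empty.
    plugin_part = f"{plugin}-{plugin_instance}" if plugin_instance else plugin
    type_part = f"{type_}-{type_instance}" if type_instance else type_
    return f"{host}/{plugin_part}/{type_part}"


if __name__ == "__main__":
    print(collectd_identifier(
        "my-vm", "curl_json", "nginx_my_service_a",
        "gauge", "active-connections",
    ))
```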
Monitoring metric descriptor and time series
On the Monitoring side, design a metric descriptor for your custom metric. The following descriptor is a reasonable choice for the data in this example:
- Name: custom.googleapis.com/nginx/active_connections
- Labels: service_name (STRING): The name of the service connected to Nginx.
- Kind: GAUGE
- Type: DOUBLE
Once you've designed the metric descriptor, you can create it
using projects.metricDescriptors.create
or you can let it be created for you
from the time series metadata, discussed below. For more information,
see Creating metric descriptors on this page.
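If you choose to create the descriptor yourself, the request body for projects.metricDescriptors.create might look like the following sketch. It only builds and prints the JSON; it does not call the API. The field names follow the MetricDescriptor resource of the Monitoring API, and the top-level description text is illustrative wording rather than anything prescribed by this guide.

```python
import json

# Request body for the example descriptor designed above.
descriptor = {
    "type": "custom.googleapis.com/nginx/active_connections",
    "metricKind": "GAUGE",
    "valueType": "DOUBLE",
    "description": "Active Nginx connections per service.",  # illustrative text
    "labels": [
        {
            "key": "service_name",
            "valueType": "STRING",
            "description": "The name of the service connected to Nginx.",
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(descriptor, indent=2))
```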
The time series data for this metric descriptor must contain the following information, because of the way the metric descriptor is defined:
- Metric name: custom.googleapis.com/nginx/active_connections
- Metric label values: service_name: either "my_service_a" or "my_service_b"
Other time series information, including the associated monitored resource—the VM instance sending the data—and the metric's data point, is automatically obtained by the agent for all metrics. You don't have to do anything special.
Your filter chain
Create a file, /opt/stackdriver/collectd/etc/collectd.d/nginx_curl_json.conf, containing the following code:
LoadPlugin match_regex
LoadPlugin target_set
LoadPlugin target_replace
# Insert a new rule in the default "PreCache" chain, to divert your metrics.
PreCacheChain "PreCache"
<Chain "PreCache">
  <Rule "jump_to_custom_metrics_from_curl_json">
    # If the plugin name and instance match, this is PROBABLY a metric we're looking for:
    <Match regex>
      Plugin "^curl_json$"
      PluginInstance "^nginx_"
    </Match>
    <Target "jump">
      # Go execute the following chain; then come back.
      Chain "PreCache_curl_json"
    </Target>
  </Rule>
  # Continue processing metrics in the default "PreCache" chain.
</Chain>
# Following is a NEW filter chain, just for your metric.
# It is only executed if the default chain "jumps" here.
<Chain "PreCache_curl_json">
  # The following rule does all the work for your metric:
  <Rule "rewrite_curl_json_my_special_metric">
    # Do a careful match for just your metrics; if it fails, drop down
    # to the next rule:
    <Match regex>
      Plugin "^curl_json$" # Match on plugin.
      PluginInstance "^nginx_my_service_.*$" # Match on plugin instance.
      Type "^gauge$" # Match on type.
      TypeInstance "^active-connections$" # Match on type instance.
    </Match>
    <Target "set">
      # Specify the metric descriptor name:
      MetaData "stackdriver_metric_type" "custom.googleapis.com/nginx/active_connections"
      # Specify a value for the "service_name" label; clean it up in the next Target:
      MetaData "label:service_name" "%{plugin_instance}"
    </Target>
    <Target "replace">
      # Remove the "nginx_" prefix in the service_name to get the real service name:
      MetaData "label:service_name" "nginx_" ""
    </Target>
  </Rule>
  # The following rule is run after rewriting your metric, or
  # if the metric wasn't one of your custom metrics. The rule returns to
  # the default "PreCache" chain. The default processing
  # will write all metrics to Cloud Monitoring,
  # which will drop any unrecognized metrics: ones that aren't
  # in the list of curated metrics and don't have
  # the custom metric metadata.
  <Rule "go_back">
    Target "return"
  </Rule>
</Chain>
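To check what the filter chain does to a given metric before deploying it, you can mimic its matching and rewriting logic offline. The sketch below is a rough Python approximation of the two chains, not collectd itself: the `apply_chain` function and the dictionary layout are hypothetical, and the prefix removal is simplified to a single substitution, which is enough for the plugin instances in this example.

```python
import re

METRIC_TYPE = "custom.googleapis.com/nginx/active_connections"


def apply_chain(metric):
    """Approximate the "PreCache" and "PreCache_curl_json" chains above.

    `metric` is a dict with plugin, plugin_instance, type, and type_instance
    keys; rewritten metadata is stored under the "meta" key.
    """
    # Rule "jump_to_custom_metrics_from_curl_json": coarse match.
    if not (re.search(r"^curl_json$", metric["plugin"])
            and re.search(r"^nginx_", metric["plugin_instance"])):
        return metric  # stays in the default chain untouched

    # Rule "rewrite_curl_json_my_special_metric": careful match.
    careful = (
        re.search(r"^nginx_my_service_.*$", metric["plugin_instance"])
        and re.search(r"^gauge$", metric["type"])
        and re.search(r"^active-connections$", metric["type_instance"])
    )
    if careful:
        meta = metric.setdefault("meta", {})
        # Target "set": name the custom metric and seed the label.
        meta["stackdriver_metric_type"] = METRIC_TYPE
        meta["label:service_name"] = metric["plugin_instance"]
        # Target "replace": strip the "nginx_" prefix from the label value.
        meta["label:service_name"] = re.sub(
            r"nginx_", "", meta["label:service_name"], count=1
        )
    # Rule "go_back": return to the default chain either way.
    return metric


if __name__ == "__main__":
    sample = {
        "plugin": "curl_json",
        "plugin_instance": "nginx_my_service_a",
        "type": "gauge",
        "type_instance": "active-connections",
    }
    print(apply_chain(sample)["meta"])
```

Metrics that fail either match pass through without the custom metric metadata, so Monitoring treats them as curated or discards them.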
Load the new configuration
Restart your agent to pick up the new configuration by executing the following command on your VM instance:
sudo service stackdriver-agent restart
Your custom metric information will begin to flow into Monitoring.
Reference and best practices
Metric descriptors and time series
For an introduction to Cloud Monitoring metrics, see Metrics, time series, and resources. More details are available in Using custom metrics and Structure of time series.
Metric descriptors. A metric descriptor has the following significant pieces:
A name of the form custom.googleapis.com/[NAME1]/.../[NAME0]. For example:
custom.googleapis.com/my_measurement
custom.googleapis.com/instance/network/received_packets_count
custom.googleapis.com/instance/network/sent_packets_count
Custom metric names must begin with custom.googleapis.com/. The recommended naming is hierarchical to make the metrics easier for people to keep track of. Metric names can't contain hyphens; for the exact naming rules, see Naming metric types and labels.
Up to 10 labels to annotate the metric data, such as device_name, fault_type, or response_code. The values of the labels aren't specified in the metric descriptor.
The kind and type of the data points, such as "a gauge value of type double". For more information, see MetricKind and ValueType.
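A quick pre-flight check can catch some descriptor mistakes before any data is written. The sketch below covers only the three constraints stated in this section (the required prefix, no hyphens in the name, and at most 10 labels); it is not a substitute for the exact naming rules, and `check_descriptor` is a hypothetical helper.

```python
PREFIX = "custom.googleapis.com/"
MAX_LABELS = 10


def check_descriptor(name, labels):
    """Return a list of problems found; an empty list means none of the
    three checks failed."""
    problems = []
    if not name.startswith(PREFIX):
        problems.append(f"name must begin with {PREFIX}")
    if "-" in name:
        problems.append("name must not contain hyphens")
    if len(labels) > MAX_LABELS:
        problems.append(f"at most {MAX_LABELS} labels are allowed")
    return problems


if __name__ == "__main__":
    print(check_descriptor(
        "custom.googleapis.com/instance/network/received_packets_count",
        ["device_name", "fault_type", "response_code"],
    ))
    print(check_descriptor("custom.googleapis.com/active-connections", []))
```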
Time series. A metric data point has the following significant pieces:
The name of the associated metric descriptor.
Values for all of the metric descriptor's labels.
A timestamped value consistent with the metric descriptor's type and kind.
The monitored resource the data came from, typically a VM instance. Space for the resource is built in, so the descriptor doesn't need a separate label for it.
Creating metric descriptors
You don't have to create a metric descriptor ahead of time. When a data point arrives in Monitoring, the point's metric name, labels, and the point's value can be used to automatically create a gauge or cumulative metric descriptor. For more information, see Auto-creation of custom metrics.
However, there are advantages to creating your own metric descriptor:
You can include some thoughtful documentation for the metric and its labels.
You can specify additional kinds and types of metrics. The only (kind, type) combinations supported by the agent are (GAUGE, DOUBLE) and (CUMULATIVE, INT64). For more information, see Metric kinds and value types.
You can specify label types other than STRING.
If you write a data point to Monitoring that uses a metric name that isn't defined, then a new metric descriptor will be created for the data point. This can be a problem if you are debugging the code that writes metric data—misspelling the metric name will result in spurious metric descriptors.
After you create a metric descriptor, or after it is created for you, it cannot be changed. For example, you can't add or remove labels. You can only delete the metric descriptor—which deletes all its data—and then recreate the descriptor the way you want.
For more details about creating metric descriptors, see Creating your metric.
Costs and limits
Cloud Monitoring charges for collectd metrics—both user-defined and agent metrics—based on the volume of metric data received. For details, see Pricing for Google Cloud's operations suite.
Apart from pricing, Cloud Monitoring has limits on the number of metric time series and the number of user-defined metric descriptors in each GCP project. For details, see Quotas and limits.
If you discover that you have created metric descriptors you no longer want, you can find and delete the descriptors using the Monitoring API. For more information, see projects.metricDescriptors.
Troubleshooting
This section explains how to configure the Monitoring agent's write_log plugin to dump out the full set of metric points, including metadata. This can be used to determine what points need to be transformed, as well as to ensure your transformations behave as expected.
Enabling write_log
The write_log plugin is included in the stackdriver-agent package. To enable the plugin:
As root, edit the following configuration file:
/etc/stackdriver/collectd.conf
Right after LoadPlugin write_gcm, add:
LoadPlugin write_log
Right after <Plugin "write_gcm">…</Plugin>, add:
<Plugin "write_log">
  Format JSON
</Plugin>
Search for <Target "write">…</Target> and after every Plugin "write_gcm", add:
Plugin "write_log"
Save your changes and restart the agent:
sudo service stackdriver-agent restart
These changes will print one log line per metric value reported, including the full collectd identifier, the metadata entries, and the value.
Output of write_log
If you were successful in the previous step, you should see the output of write_log in the system logs:
- Debian-based Linux: /var/log/syslog
- Red Hat-based Linux: /var/log/messages
The sample lines below have been formatted to make them easier to read in this document.
Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
"values":[1933524992], "dstypes":["gauge"], "dsnames":["value"],
"time":1481210025.252, "interval":60.000,
"host":"test-write-log.c.test-write-log.internal",
"plugin":"df", "plugin_instance":"udev", "type":"df_complex", "type_instance":"free"}]
Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
"values":[0], "dstypes":["gauge"], "dsnames":["value"],
"time":1481210025.252, "interval":60.000,
"host":"test-write-log.c.test-write-log.internal",
"plugin":"df", "plugin_instance":"udev", "type":"df_complex", "type_instance":"reserved"}]
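When checking many of these lines, it can help to extract the JSON payload programmatically. The sketch below assumes the layout shown in the samples: the payload follows the text "write_log values:" and the syslog daemon has escaped the embedded newline as #012. The function name `parse_write_log_line` is hypothetical, and the "meta" key used for metadata entries is an assumption, since the sample lines above carry no metadata.

```python
import json

MARKER = "write_log values:"


def parse_write_log_line(line):
    """Return the list of metric entries in one write_log syslog line."""
    _, _, payload = line.partition(MARKER)
    if not payload:
        return []  # not a write_log line
    # Undo the syslog escaping of the newline that precedes the JSON array.
    payload = payload.replace("#012", "\n")
    return json.loads(payload)


if __name__ == "__main__":
    sample = (
        'Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{'
        '"values":[1933524992], "dstypes":["gauge"], "dsnames":["value"], '
        '"time":1481210025.252, "interval":60.000, '
        '"host":"test-write-log.c.test-write-log.internal", '
        '"plugin":"df", "plugin_instance":"udev", '
        '"type":"df_complex", "type_instance":"free"}]'
    )
    for entry in parse_write_log_line(sample):
        # "meta" is an assumed key name for the metadata entries.
        print(entry["plugin"], entry["type_instance"],
              entry["values"], entry.get("meta", {}))
```

For a metric rewritten by your filter chain, the metadata printed here should include stackdriver_metric_type and any label: entries you set.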