Configure the Ops Agent


This document provides details about the Ops Agent's default and custom configurations. Read this document if any of the following applies to you:

  • You want to change the configuration of the Ops Agent, for example to turn off the built-in collection of logs or metrics, or to customize which logs and metrics the agent collects.

  • You're interested in learning the technical details of the Ops Agent's configuration.

The Ops Agent also provides configuration instructions for collecting metrics and logs from supported third-party applications. See Monitoring third-party applications for the list of supported applications.

Configuration model

The Ops Agent uses a built-in default configuration; you can't directly modify this built-in configuration. Instead, you create a file of overrides that are merged with the built-in configuration when the agent restarts.

The building blocks of the configuration are as follows:

  • receivers: This element describes what is collected by the agent.
  • processors: This element describes how the agent can modify the collected information.
  • service: This element links receivers and processors together to create data flows, called pipelines. The service element contains a pipelines element, which can contain multiple pipelines.

The built-in configuration is made up of these elements, and you use the same elements to override that built-in configuration.
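
As a rough sketch of how these elements fit together, the following outline shows one receiver, one processor, and one pipeline in a logging configuration. The identifiers RECEIVER_ID, PROCESSOR_ID, and PIPELINE_ID and the file path are placeholders chosen for illustration; they aren't part of the built-in configuration.

```yaml
logging:
  receivers:
    RECEIVER_ID:                 # what to collect
      type: files
      include_paths:
      - /var/log/example.log     # hypothetical path
  processors:
    PROCESSOR_ID:                # how to modify what was collected
      type: parse_json
  service:
    pipelines:
      PIPELINE_ID:               # links the receiver and the processor
        receivers: [RECEIVER_ID]
        processors: [PROCESSOR_ID]
```

The metrics section follows the same receivers, processors, and service layout.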

Built-in configuration

The built-in configuration for the Ops Agent defines the default collection for logs and metrics. The following shows the built-in configuration for Linux and for Windows:

Linux

By default, the Ops Agent collects file-based syslog logs and host metrics.

For more information about the metrics collected, see Metrics ingested by the receiver types.

logging:
  receivers:
    syslog:
      type: files
      include_paths:
      - /var/log/messages
      - /var/log/syslog
  service:
    pipelines:
      default_pipeline:
        receivers: [syslog]
metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 60s
  processors:
    metrics_filter:
      type: exclude_metrics
      metrics_pattern: []
  service:
    pipelines:
      default_pipeline:
        receivers: [hostmetrics]
        processors: [metrics_filter]

Windows

By default, the Ops Agent collects Windows event logs from System, Application, and Security channels, as well as host metrics, IIS metrics, and SQL Server metrics.

For more information about the metrics collected, see Metrics ingested by the receiver types.

logging:
  receivers:
    windows_event_log:
      type: windows_event_log
      channels: [System, Application, Security]
  service:
    pipelines:
      default_pipeline:
        receivers: [windows_event_log]
metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 60s
    iis:
      type: iis
      collection_interval: 60s
    mssql:
      type: mssql
      collection_interval: 60s
  processors:
    metrics_filter:
      type: exclude_metrics
      metrics_pattern: []
  service:
    pipelines:
      default_pipeline:
        receivers: [hostmetrics, iis, mssql]
        processors: [metrics_filter]

These configurations are discussed in more detail in Logging configuration and Metrics configuration.

User-specified configuration

To override the built-in configuration, you add new configuration elements to the user configuration file. Put your configuration for the Ops Agent in the following files:

  • For Linux: /etc/google-cloud-ops-agent/config.yaml.

  • For Windows: C:\Program Files\Google\Cloud Operations\Ops Agent\config\config.yaml.

Any user-specified configuration is merged with the built-in configuration when the agent restarts.

To override a built-in receiver, processor, or pipeline, redefine it in your config.yaml file by declaring it with the same identifier.

For example, the built-in configuration for metrics includes a hostmetrics receiver that specifies a 60-second collection interval. To change the collection interval for host metrics to 30 seconds, include a metrics receiver called hostmetrics in your config.yaml file that sets the collection_interval value to 30 seconds, as shown in the following example:

metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 30s

For other examples of changing the built-in configurations, see Logging configuration and Metrics configuration.

You can also turn off the collection of logging or metric data. These changes are described in the example logging service configurations and metrics service configurations.

Logging configurations

The logging configuration uses the configuration model described previously:

  • receivers: This element describes the data to collect from log files; this data is mapped into a <timestamp, record> model.
  • processors: This optional element describes how the agent can modify the collected information.
  • service: This element links receivers and processors together to create data flows, called pipelines. The service element contains a pipelines element, which can include multiple pipeline definitions.

Each receiver and each processor can be used in multiple pipelines.

The following sections describe each of these elements.

Logging receivers

The receivers element contains a set of receivers, each identified by a RECEIVER_ID. A receiver describes how to retrieve the logs; for example, by tailing files, by using a TCP port, or from the Windows Event Log.

Structure of logging receivers

Each receiver must have an identifier, RECEIVER_ID, and include a type element. The valid types are:

  • files: Collect logs by tailing files on disk.
  • fluent_forward (available with Ops Agent versions 2.12.0 and later): Collect logs sent via the Fluent Forward protocol over TCP.
  • syslog: Collect syslog via TCP or UDP.
  • tcp (available with Ops Agent versions 2.3.0 and later): Collect JSON format logs by listening to a TCP port.
  • windows_event_log (Windows only): Collect Windows Event Logs.
  • systemd_journald (available with Ops Agent versions 2.4.0 and later on Linux): Collect logs from the systemd-journald service.
  • Third-party application log receivers

The receivers structure looks like the following:

receivers:
  RECEIVER_ID:
    type: files
    ...
  RECEIVER_ID_2:
    type: syslog
    ...

Depending on the value of the type element, there might be other configuration options, as follows:

  • files receivers:

    • include_paths: Required. A list of filesystem paths to read by tailing each file. A wildcard (*) can be used in the paths; for example, /var/log/*.log (Linux) or C:\logs\*.log (Windows).

      For a list of common Linux application log files, see Common Linux log files.

    • exclude_paths: Optional. A list of filesystem path patterns to exclude from the set matched by include_paths.

    • record_log_file_path: Optional. If set to true, then the path to the specific file from which the log record was obtained appears in the output log entry as the value of the agent.googleapis.com/log_file_path label. When using a wildcard, only the path of the file from which the record was obtained is recorded.

    • wildcard_refresh_interval: Optional. The interval at which wildcard file paths in include_paths are refreshed. Given as a time duration, for example, 30s, 2m. This property might be useful under high logging throughputs where log files are rotated faster than the default interval. If not specified, the default interval is 60 seconds.

  • fluent_forward receivers:

    • listen_host: Optional. An IP address to listen on. The default value is 127.0.0.1.

    • listen_port: Optional. A port to listen on. The default value is 24224.

  • syslog receivers:

    • transport_protocol: Optional. Supported values: tcp, udp. The default is tcp.

      The following additional options can be used:

      • listen_host: Optional. An IP address to listen on. The default value is 0.0.0.0.

      • listen_port: Optional. A port to listen on. The default value is 5140.

  • tcp receivers:

    • format: Required. Log format. Supported value: json.

    • listen_host: Optional. An IP address to listen on. The default value is 127.0.0.1.

    • listen_port: Optional. A port to listen on. The default value is 5170.

  • windows_event_log receivers (for Windows only):

    • channels: Required. A list of Windows Event Log channels from which to read logs.

Examples of logging receivers

Sample files receiver:

receivers:
  RECEIVER_ID:
    type: files

    include_paths: [/var/log/*.log]
    exclude_paths: [/var/log/not-this-one.log]
    record_log_file_path: true
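
The optional wildcard_refresh_interval setting isn't shown in the preceding sample. The following is a minimal sketch, assuming an application that writes logs under a hypothetical /var/log/myapp/ directory and rotates them more often than once per minute:

```yaml
receivers:
  RECEIVER_ID:
    type: files
    include_paths: [/var/log/myapp/*.log]   # hypothetical directory
    # Re-evaluate the wildcard every 30 seconds instead of the
    # default 60 seconds, so that newly rotated files are found sooner.
    wildcard_refresh_interval: 30s
```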

Sample fluent_forward receiver:

receivers:
  RECEIVER_ID:
    type: fluent_forward

    listen_host: 127.0.0.1
    listen_port: 24224

Sample syslog receiver:

receivers:
  RECEIVER_ID:
    type: syslog

    transport_protocol: tcp
    listen_host: 0.0.0.0
    listen_port: 5140
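
Because transport_protocol also accepts udp, a UDP variant of the same receiver might look like the following sketch. The listen address and port simply repeat the documented defaults:

```yaml
receivers:
  RECEIVER_ID:
    type: syslog
    transport_protocol: udp   # the default is tcp
    listen_host: 0.0.0.0
    listen_port: 5140
```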

Sample tcp receiver:

receivers:
  RECEIVER_ID:
    type: tcp

    format: json
    listen_host: 127.0.0.1
    listen_port: 5170

Sample windows_event_log receiver (Windows only):

receivers:
  RECEIVER_ID:
    type: windows_event_log

    channels: [System,Application,Security]

Sample systemd_journald receiver:

receivers:
  RECEIVER_ID:
    type: systemd_journald

Special fields in structured payloads

For processors and receivers that can ingest structured data (the fluent_forward and tcp receivers and the parse_json processor), you can set special fields in the input that will map to specific fields in the LogEntry object that the agent writes to the Logging API.

When the Ops Agent receives external structured log data, it places top-level fields into the LogEntry's jsonPayload field unless the field name is listed in the following table:

Record field | LogEntry field
Timestamp fields (either of the two forms shown after this table) | timestamp
receiver_id (not a record field) | logName
logging.googleapis.com/httpRequest (string) | httpRequest
logging.googleapis.com/severity (string) | severity
logging.googleapis.com/labels (struct of string:string) | labels
logging.googleapis.com/operation (struct) | operation
logging.googleapis.com/sourceLocation (struct) | sourceLocation

The timestamp can be given in either of the following forms:

Option 1

"timestamp": {
  "seconds": CURRENT_SECONDS,
  "nanos": CURRENT_NANOS
}

Option 2

{
  "timestampSeconds": CURRENT_SECONDS,
  "timestampNanos": CURRENT_NANOS
}

Any remaining structured record fields remain part of the jsonPayload structure.
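
To illustrate the mapping, the following sketch defines a tcp receiver and, in comments, shows a sample record and the LogEntry fields it would be expected to produce according to the rules above. The receiver ID, pipeline ID, and record contents are made up for this example.

```yaml
logging:
  receivers:
    json_tcp:                 # hypothetical receiver ID
      type: tcp
      format: json
      listen_host: 127.0.0.1
      listen_port: 5170
  service:
    pipelines:
      json_pipeline:          # hypothetical pipeline ID
        receivers: [json_tcp]

# Sample record sent to 127.0.0.1:5170 (one JSON object per line):
#   {"message": "disk full",
#    "logging.googleapis.com/severity": "ERROR",
#    "logging.googleapis.com/labels": {"team": "storage"}}
#
# Expected result, following the mapping described above:
#   severity:    ERROR
#   labels:      team = storage
#   jsonPayload: {"message": "disk full"}
```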

Common Linux log files

The following table lists common log files for frequently used Linux applications:

Application Common log files
apache For information about Apache log files, see Monitoring third-party applications: Apache Web Server.
cassandra For information about Cassandra log files, see Monitoring third-party applications: Cassandra.
chef /var/log/chef-server/bookshelf/current
/var/log/chef-server/chef-expander/current
/var/log/chef-server/chef-pedant/http-traffic.log
/var/log/chef-server/chef-server-webui/current
/var/log/chef-server/chef-solr/current
/var/log/chef-server/erchef/current
/var/log/chef-server/erchef/erchef.log.1
/var/log/chef-server/nginx/access.log
/var/log/chef-server/nginx/error.log
/var/log/chef-server/nginx/rewrite-port-80.log
/var/log/chef-server/postgresql/current
gitlab /home/git/gitlab/log/application.log
/home/git/gitlab/log/githost.log
/home/git/gitlab/log/production.log
/home/git/gitlab/log/satellites.log
/home/git/gitlab/log/sidekiq.log
/home/git/gitlab/log/unicorn.stderr.log
/home/git/gitlab/log/unicorn.stdout.log
/home/git/gitlab-shell/gitlab-shell.log
jenkins /var/log/jenkins/jenkins.log
jetty /var/log/jetty/out.log
/var/log/jetty/*.request.log
/var/log/jetty/*.stderrout.log
joomla /var/www/joomla/logs/*.log
magento /var/www/magento/var/log/exception.log
/var/www/magento/var/log/system.log
/var/www/magento/var/report/*
mediawiki /var/log/mediawiki/*.log
memcached For information about Memcached log files, see Monitoring third-party applications: Memcached.
mongodb For information about MongoDB log files, see Monitoring third-party applications: MongoDB.
mysql For information about MySQL log files, see Monitoring third-party applications: MySQL.
nginx For information about nginx log files, see Monitoring third-party applications: nginx.
postgres For information about PostgreSQL log files, see Monitoring third-party applications: PostgreSQL.
puppet /var/log/puppet/http.log
/var/log/puppet/masterhttp.log
puppet-enterprise /var/log/pe-activemq/activemq.log
/var/log/pe-activemq/wrapper.log
/var/log/pe-console-auth/auth.log
/var/log/pe-console-auth/cas_client.log
/var/log/pe-console-auth/cas.log
/var/log/pe-httpd/access.log
/var/log/pe-httpd/error.log
/var/log/pe-httpd/other_vhosts_access.log
/var/log/pe-httpd/puppetdashboard.access.log
/var/log/pe-httpd/puppetdashboard.error.log
/var/log/pe-httpd/puppetmasteraccess.log
/var/log/pe-mcollective/mcollective_audit.log
/var/log/pe-mcollective/mcollective.log
/var/log/pe-puppet-dashboard/certificate_manager.log
/var/log/pe-puppet-dashboard/event-inspector.log
/var/log/pe-puppet-dashboard/failed_reports.log
/var/log/pe-puppet-dashboard/live-management.log
/var/log/pe-puppet-dashboard/mcollective_client.log
/var/log/pe-puppet-dashboard/production.log
/var/log/pe-puppetdb/pe-puppetdb.log
/var/log/pe-puppet/masterhttp.log
/var/log/pe-puppet/rails.log
rabbitmq For information about RabbitMQ log files, see Monitoring third-party applications: RabbitMQ.
redis For information about Redis log files, see Monitoring third-party applications: Redis.
redmine /var/log/redmine/*.log
salt /var/log/salt/key
/var/log/salt/master
/var/log/salt/minion
/var/log/salt/syndic.loc
solr For information about Apache Solr log files, see Monitoring third-party applications: Apache Solr.
sugarcrm /var/www/*/sugarcrm.log
syslog /var/log/syslog
/var/log/messages
tomcat For information about Apache Tomcat log files, see Monitoring third-party applications: Apache Tomcat.
zookeeper For information about Apache ZooKeeper log files, see Monitoring third-party applications: Apache ZooKeeper.

Default ingested labels

Logs can contain the following labels by default in the LogEntry:

Field | Sample value | Description
labels."compute.googleapis.com/resource_name" | test_vm | The name of the virtual machine from which this log originates. Written for all logs.
labels."logging.googleapis.com/instrumentation_source" | agent.googleapis.com/apache_access | The value of the receiver type from which this log originates, prefixed by agent.googleapis.com/. Written only by receivers from third-party integrations.

Logging processors

The optional processors element contains a set of processing directives, each identified by a PROCESSOR_ID. A processor describes how to manipulate the information collected by a receiver.

Each processor must have a unique identifier and include a type element. The valid types are:

  • parse_json: Parse JSON-formatted structured logs.
  • parse_multiline: Parse multiline logs. (Linux only)
  • parse_regex: Parse text-formatted logs via regex patterns to turn them into JSON-formatted structured logs.
  • exclude_logs: Exclude logs that match specified rules (starting in 2.9.0).
  • modify_fields: Set/transform fields in log entries (starting in 2.14.0).

The processors structure looks like the following:

processors:
  PROCESSOR_ID:
    type: parse_json
    ...
  PROCESSOR_ID_2:
    type: parse_regex
    ...

Depending on the value of the type element, there are other configuration options, as follows.

parse_json processor

Configuration structure

processors:
  PROCESSOR_ID:
    type: parse_json

    time_key:    <field name within jsonPayload>
    time_format: <strptime format string>

The parse_json processor parses the input JSON into the jsonPayload field of the LogEntry. Other parts of the LogEntry can be parsed by setting certain special top-level fields.

  • time_key: Optional. If the log entry provides a field with a timestamp, this option specifies the name of that field. The extracted value is used to set the timestamp field of the resulting LogEntry and is removed from the payload.

    If the time_key option is specified, you must also specify the following:

    • time_format: Required if time_key is used. This option specifies the format of the time_key field so it can be recognized and analyzed properly. For details of the format, see the strptime(3) guide.

Example configuration

processors:
  PROCESSOR_ID:
    type: parse_json

    time_key:    time
    time_format: "%Y-%m-%dT%H:%M:%S.%L%Z"
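
A processor has no effect until a pipeline references it. The following sketch places the parse_json processor in a complete logging configuration; the receiver ID, pipeline ID, and file path are hypothetical.

```yaml
logging:
  receivers:
    app_json:                          # hypothetical receiver ID
      type: files
      include_paths:
      - /var/log/myapp/events.json     # hypothetical path
  processors:
    PROCESSOR_ID:
      type: parse_json
      time_key:    time
      time_format: "%Y-%m-%dT%H:%M:%S.%L%Z"
  service:
    pipelines:
      app_pipeline:                    # hypothetical pipeline ID
        receivers: [app_json]
        processors: [PROCESSOR_ID]
```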

parse_multiline processor

Configuration structure

processors:
  PROCESSOR_ID:
    type: parse_multiline

    match_any:
    - type: <type of the exceptions>
      language: <language name>

  • match_any: Required. A list of one or more rules.

    • type: Required. Only a single value is supported:

      • language_exceptions: Allows the processor to concatenate exceptions into one LogEntry, based on the value of the language option.

    • language: Required. The supported values are:

      • java: Concatenates common Java exceptions into one LogEntry.
      • python: Concatenates common Python exceptions into one LogEntry.
      • go: Concatenates common Go exceptions into one LogEntry.

Example configuration

processors:
  PROCESSOR_ID:
    type: parse_multiline

    match_any:
    - type: language_exceptions
      language: java
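
Because match_any takes a list of rules, a single processor can presumably cover more than one language. The following sketch assumes a log file that mixes Java and Python stack traces:

```yaml
processors:
  PROCESSOR_ID:
    type: parse_multiline
    match_any:
    # One rule per language whose exceptions should be concatenated.
    - type: language_exceptions
      language: java
    - type: language_exceptions
      language: python
```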

parse_regex processor

Configuration structure

processors:
  PROCESSOR_ID:
    type: parse_regex

    regex: <regular expression>

    time_key:    <field name within jsonPayload>
    time_format: <format string>

  • time_key: Optional. If the log entry provides a field with a timestamp, this option specifies the name of that field. The extracted value is used to set the timestamp field of the resulting LogEntry and is removed from the payload.

    If the time_key option is specified, you must also specify the following:

    • time_format: Required if time_key is used. This option specifies the format of the time_key field so it can be recognized and analyzed properly. For details of the format, see the strptime(3) guide.

  • regex: Required. The regular expression for parsing the field. The expression must include key names for the matched subexpressions; for example, "^(?<time>[^ ]*) (?<severity>[^ ]*) (?<msg>.*)$".

    The text matched by named capture groups will be placed into fields in the LogEntry's jsonPayload field. To add additional structure to your logs, use the modify_fields processor.

    For a set of regular expressions for extracting information from common Linux application log files, see Common Linux log files.

Example configuration

processors:
  PROCESSOR_ID:
    type: parse_regex

    regex:       "^(?<time>[^ ]*) (?<severity>[^ ]*) (?<msg>.*)$"
    time_key:    time
    time_format: "%Y-%m-%dT%H:%M:%S.%L%Z"
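
To make the effect of the named capture groups concrete, the following sketch repeats the example and annotates it with a made-up log line. The comments describe how the regular expression splits that line; whether the timestamp is accepted depends on it matching time_format.

```yaml
processors:
  PROCESSOR_ID:
    type: parse_regex
    regex:       "^(?<time>[^ ]*) (?<severity>[^ ]*) (?<msg>.*)$"
    time_key:    time
    time_format: "%Y-%m-%dT%H:%M:%S.%L%Z"

# Hypothetical input line:
#   2024-05-01T12:34:56.789Z ERROR disk quota exceeded
#
# Capture groups:
#   time     = 2024-05-01T12:34:56.789Z  (used for the LogEntry timestamp
#                                         and removed from the payload)
#   severity = ERROR                     (stored as jsonPayload.severity)
#   msg      = disk quota exceeded       (stored as jsonPayload.msg)
```

Note that the severity capture group lands in jsonPayload, not in the LogEntry severity field; moving it would take a modify_fields processor.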

exclude_logs processor

Configuration structure:

type: exclude_logs
match_any:
  - <filter>
  - <filter>

The top-level configuration for this processor contains a single field, match_any, which contains a list of filter rules.

  • match_any: Required. A list of one or more rules. If a log entry matches any rule, then the Ops Agent doesn't ingest that entry.

    The logs that are ingested by Ops Agent follow the LogEntry structure. Field names are case-sensitive. You can only specify rules based on the following fields and their subfields:

    • httpRequest
    • jsonPayload
    • labels
    • operation
    • severity
    • sourceLocation

    The following example rule

    severity =~ "(DEBUG|INFO)"
    

    uses a regular expression to exclude all DEBUG and INFO level logs.

    Rules follow the Cloud Logging query language syntax but only support a subset of the features that Logging query language supports:

    • Comparison operators: =, !=, :, =~, !~. Only string comparisons are supported.
    • Navigation operator: . (a period). For example, jsonPayload.message.
    • Boolean operators: AND, OR, NOT.
    • Grouping expressions with ( ).

Example configuration

processors:
  PROCESSOR_ID:
    type: exclude_logs
    match_any:
    - '(jsonPayload.message =~ "log spam 1" OR jsonPayload.message =~ "log spam 2") AND severity = "ERROR"'
    - 'jsonPayload.application = "foo" AND severity = "INFO"'
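
As a further sketch, the following processor combines the regular-expression rule shown earlier with the NOT operator. The jsonPayload.component field and its value are hypothetical; they stand in for whatever field identifies the logs you want to keep.

```yaml
processors:
  PROCESSOR_ID:
    type: exclude_logs
    match_any:
    # Drop DEBUG and INFO entries, except those from the billing component.
    - 'severity =~ "(DEBUG|INFO)" AND NOT jsonPayload.component = "billing"'
```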

modify_fields processor

The modify_fields processor allows customization of the structure and contents of log entries.

Configuration structure

type: modify_fields
fields:
  <destination field>:
    # Source
    move_from: <source field>
    copy_from: <source field>
    static_value: <string>

    # Mutation
    default_value: <string>
    map_values:
      <old value>: <new value>
    map_values_exclusive: {true|false}
    type: {integer|float}
    omit_if: <filter>

The top-level configuration for this processor contains a single field, fields, which contains a map of output field names and corresponding translations. For each output field, an optional source and zero or more mutation operations are applied.

All field names use the dot-separated syntax from the Cloud Logging query language. Filters use the Cloud Logging query language.

All transformations are applied in parallel, which means that sources and filters operate on the original input log entry and therefore cannot reference the new value of any other field being modified by the same processor.

Source options: At most one specified source is allowed.

  • No source specified

    If no source value is specified, the existing value in the destination field will be modified.

  • move_from: <source field>

    The value from <source field> will be used as the source for the destination field. Additionally, <source field> will be removed from the log entry. If a source field is referenced by both move_from and copy_from, the source field will still be removed.

  • copy_from: <source field>

    The value from <source field> will be used as the source for the destination field. <source field> will not be removed from the log entry unless it is also referenced by a move_from operation or otherwise modified.

  • static_value: <string>

    The static string <string> will be used as the source for the destination field.

Mutation options: Zero or more mutation operators may be applied to a single field. If multiple operators are supplied, they will always be applied in the following order.

  1. default_value: <string>

    If the source field did not exist, the output value will be set to <string>. If the source field already exists (even if it contains an empty string), the original value is unmodified.

  2. map_values: <map>

    If the input value matches one of the keys in <map>, the output value will be replaced with the corresponding value from the map.

  3. map_values_exclusive: {true|false}

    In case the <source field> value does not match any keys specified in the map_values pairs, the destination field will be forcefully unset if map_values_exclusive is true, or left untouched if map_values_exclusive is false.

  4. type: {integer|float}

    The input value will be converted to an integer or a float. If the string cannot be converted to a number, the output value will be unset. If the string contains a float but the type is specified as integer, the number will be truncated to an integer.

    Note that the Cloud Logging API uses JSON and therefore it does not support a full 64-bit integer; if a 64-bit (or larger) integer is needed, it must be stored as a string in the log entry.

  5. omit_if: <filter>

    If the filter matches the input log record, the output field will be unset. This can be used to remove placeholder values, such as:

    httpRequest.referer:
      move_from: jsonPayload.referer
      omit_if: jsonPayload.referer = "-"
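
The following sketch combines several of these options. It assumes that log records carry a hypothetical jsonPayload.level field with abbreviated values; the processor ID, field names, and values are illustrative only.

```yaml
processors:
  normalize_level:                  # hypothetical processor ID
    type: modify_fields
    fields:
      # No source is given, so the existing value of the field is modified.
      jsonPayload.level:
        default_value: unknown      # applied first, only if the field is missing
        map_values:                 # applied second
          warn: warning
          err: error
        map_values_exclusive: false # values that match no key are left untouched
      # A static source: every entry gets the same value.
      jsonPayload.source_system:
        static_value: inventory
```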
    

Sample configurations

The parse_json processor would transform a JSON file containing

{
  "http_status": "400",
  "path": "/index.html",
  "referer": "-"
}

into a LogEntry structure that looks like this:

{
  "jsonPayload": {
    "http_status": "400",
    "path": "/index.html",
    "referer": "-"
  }
}

This could then be transformed with modify_fields into this LogEntry:

{
  "httpRequest": {
    "status": 400,
    "requestUrl": "/index.html"
  }
}

using this Ops Agent configuration:

logging:
  receivers:
    in:
      type: files
      include_paths:
      - /var/log/http.json
  processors:
    parse_json:
      type: parse_json
    set_http_request:
      type: modify_fields
      fields:
        httpRequest.status:
          move_from: jsonPayload.http_status
          type: integer
        httpRequest.requestUrl:
          move_from: jsonPayload.path
        httpRequest.referer:
          move_from: jsonPayload.referer
          omit_if: jsonPayload.referer = "-"
  service:
    pipelines:
      pipeline:
        receivers: [in]
        processors: [parse_json, set_http_request]

This configuration reads JSON-formatted logs from /var/log/http.json and populates part of the httpRequest structure from fields in the logs.

Logging service

The logging service customizes verbosity for the Ops Agent's own logs and links logging receivers and processors together into pipelines. The service section has two elements: log_level and pipelines.

Log verbosity level

log_level, available with Ops Agent versions 2.6.0 and later, customizes verbosity for the Ops Agent logging submodule's own logs. The default is info. Available options are: error, warn, info, debug, trace.

The following configuration customizes log verbosity for the logging submodule to be debug instead:

logging:
  service:
    log_level: debug

Logging pipelines

pipelines can contain multiple pipeline IDs and definitions. Each pipeline definition consists of the following elements:

  • receivers: Required for new pipelines. A list of receiver IDs, as described in Logging receivers. The order of the receiver IDs in the list doesn't matter. The pipeline collects data from all of the listed receivers.

  • processors: Optional. A list of processor IDs, as described in Logging processors. The order of the processor IDs in the list does matter. Each record is run through the processors in the listed order.

Example logging service configurations

A service configuration has the following structure:

service:
  log_level: CUSTOM_LOG_LEVEL
  pipelines:
    PIPELINE_ID:
      receivers:  [...]
      processors: [...]
    PIPELINE_ID_2:
      receivers:  [...]
      processors: [...]

To stop the agent from collecting and sending either /var/log/messages or /var/log/syslog entries, redefine the default pipeline with an empty receivers list and no processors. This configuration does not stop the agent's logging subcomponent, because the agent must be able to collect logs for the monitoring subcomponent. The entire empty logging configuration looks like the following:

logging:
  service:
    pipelines:
      default_pipeline:
        receivers: []

The following service configuration defines a pipeline with the ID custom_pipeline:

logging:
  service:
    pipelines:
      custom_pipeline:
        receivers:
        - RECEIVER_ID
        processors:
        - PROCESSOR_ID
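
Because processors run in the listed order, a processor that filters on parsed fields needs to come after the processor that produces them. The following sketch shows this with hypothetical IDs, a hypothetical path, and a hypothetical jsonPayload.application field:

```yaml
logging:
  receivers:
    app_json:                          # hypothetical receiver ID
      type: files
      include_paths:
      - /var/log/myapp/events.json     # hypothetical path
  processors:
    parse_app_json:                    # hypothetical processor ID
      type: parse_json
    drop_foo:                          # hypothetical processor ID
      type: exclude_logs
      match_any:
      - 'jsonPayload.application = "foo"'
  service:
    pipelines:
      custom_pipeline:
        receivers: [app_json]
        # parse_app_json runs first so that jsonPayload.application
        # exists by the time drop_foo evaluates its rule.
        processors: [parse_app_json, drop_foo]
```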

Metrics configurations

The metrics configuration uses the configuration model described previously:

  • receivers: a list of receiver definitions. A receiver describes the source of the metrics; for example, system metrics like cpu or memory. The receivers in this list can be shared among multiple pipelines.
  • processors: a list of processor definitions. A processor describes how to modify the metrics collected by a receiver.
  • service: contains a pipelines section that is a list of pipeline definitions. A pipeline connects a list of receivers and a list of processors to form the data flow.

The following sections describe each of these elements.

Metrics receivers

The receivers element contains a set of receiver definitions. A receiver describes where to retrieve the metrics from, such as cpu and memory. A receiver can be shared among multiple pipelines.

Structure of metrics receivers

Each receiver must have an identifier, RECEIVER_ID, and include a type element. Valid types are:

  • hostmetrics
  • iis (Windows only)
  • mssql (Windows only)

A receiver can also specify the optional collection_interval option. The value is in the format of a duration, for example, 30s or 2m. The default value is 60s.

Each of these receiver types collects a set of metrics; for information about the specific metrics included, see Metrics ingested by the receiver types.

You can create only one receiver for each type. For example, you can't define two receivers of type hostmetrics.

Changing the collection interval in the metrics receivers

Some critical workloads might require fast alerting. By reducing the collection interval for the metrics, you can configure more sensitive alerts. For information on how alerts are evaluated, see Behavior of metric-based alerting policies.

For example, the following receiver changes the collection interval for host metrics (the receiver ID is hostmetrics) from the default of 60 seconds to 10 seconds:

metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 10s

You can also override the collection interval for the Windows iis and mssql metrics receivers using the same technique.
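
For example, on Windows the following sketch would shorten the interval for both receivers. The receiver IDs iis and mssql match the built-in Windows configuration; the 30-second value is an arbitrary choice for illustration.

```yaml
metrics:
  receivers:
    iis:
      type: iis
      collection_interval: 30s
    mssql:
      type: mssql
      collection_interval: 30s
```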

Metrics ingested by the receivers

The metrics ingested by the Ops Agent have identifiers that begin with the following pattern: agent.googleapis.com/GROUP. The GROUP component identifies a set of related metrics; it has values like cpu, network, and others.

The hostmetrics receiver ingests the following metric groups. For more information, see the linked section for each group on the Ops Agent metrics page.

Group Metric
cpu CPU load at 1 minute intervals
CPU load at 5 minute intervals
CPU load at 15 minute intervals
CPU usage, with labels for CPU number and CPU state
CPU usage percent, with labels for CPU number and CPU state
disk Disk bytes read, with label for device
Disk bytes written, with label for device
Disk I/O time, with label for device
Disk weighted I/O time, with label for device
Disk pending operations, with label for device
Disk merged operations, with labels for device and direction
Disk operations, with labels for device and direction
Disk operation time, with labels for device and direction
Disk usage, with labels for device and state
Disk utilization, with labels for device and state
interface
Linux only
Total count of network errors
Total count of packets sent over the network
Total number of bytes sent over the network
memory Memory usage, with label for state (buffered, cached, free, slab, used)
Memory usage percent, with label for state (buffered, cached, free, slab, used)
network TCP connection count, with labels for port and TCP state
swap Swap I/O operations, with label for direction
Swap bytes used, with labels for device and state
Swap percent used, with labels for device and state
pagefile
Windows only
Current percentage of pagefile used by state
processes Processes count, with label for state
Processes forked count
Per-process disk read I/O, with labels for process name + others
Per-process disk write I/O, with labels for process name + others
Per-process RSS usage, with labels for process name + others
Per-process VM usage, with labels for process name + others

The iis receiver (Windows only) ingests the metrics of the iis group. For more information, see the Ops Agent metrics page.

Group Metric
iis
Windows only
Currently open connections to IIS
Network bytes transferred by IIS
Connections opened to IIS
Requests made to IIS

The mssql receiver (Windows only) ingests metrics of the mssql group. For more information, see the Ops Agent metrics page.

Group Metric
mssql
Windows only
Currently open connections to SQL server
SQL server total transactions per second
SQL server write transactions per second

Metrics processors

The processors element contains a set of processor definitions. A processor describes metrics from the receiver type to exclude. The only supported type is exclude_metrics, which takes a metrics_pattern option. The value is a list of globs that match the metric types you want to exclude from the group collected by a receiver; for example, agent.googleapis.com/cpu/* or agent.googleapis.com/processes/*. To find the fully qualified names of individual metrics, see the group's table on the Ops Agent metrics page.

Sample metrics processor

The following example shows the exclude_metrics processor supplied in the built-in configurations. This processor supplies an empty metrics_pattern value, so it doesn't exclude any metrics.

processors:
  metrics_filter:
    type: exclude_metrics
    metrics_pattern: []

To disable the collection of all process metrics by the Ops Agent, add the following to your config.yaml file:

metrics:
  processors:
    metrics_filter:
      type: exclude_metrics
      metrics_pattern:
      - agent.googleapis.com/processes/*

This excludes process metrics from collection in the metrics_filter processor that applies to the default pipeline in the metrics service.
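
Because metrics_pattern is a list, more than one group can be excluded at once. The following sketch excludes both of the glob patterns mentioned earlier:

```yaml
metrics:
  processors:
    metrics_filter:
      type: exclude_metrics
      metrics_pattern:
      - agent.googleapis.com/cpu/*
      - agent.googleapis.com/processes/*
```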

Metrics service

The metrics service customizes verbosity for the Ops Agent metrics module's own logs and links metrics receivers and processors together into pipelines. The service section has two elements: log_level and pipelines.

Metrics verbosity level

log_level, available with Ops Agent versions 2.6.0 and later, customizes verbosity for the Ops Agent metrics submodule's own logs. The default is info. Available options are: error, warn, info, debug.

Metrics pipelines

The pipelines element can contain multiple pipeline IDs and definitions. Each pipeline definition consists of the following elements:

  • receivers: Required for new pipelines. A list of receiver IDs, as described in Metrics receivers. The order of the receiver IDs in the list doesn't matter. The pipeline collects data from all of the listed receivers.

  • processors: Optional. A list of processor IDs, as described in Metrics processors. The order of the processor IDs in the list does matter. Each metric point is run through the processors in the listed order.

Example metrics service configurations

A service configuration has the following structure:

service:
  log_level: CUSTOM_LOG_LEVEL
  pipelines:
    PIPELINE_ID:
      receivers:  [...]
      processors: [...]
    PIPELINE_ID_2:
      receivers:  [...]
      processors: [...]

To turn off the built-in ingestion of host metrics, redefine the default pipeline with an empty receivers list and no processors. The entire metrics configuration looks like the following:

metrics:
  service:
    pipelines:
      default_pipeline:
        receivers: []

The following example shows the built-in service configuration for Windows:

metrics:
  service:
    pipelines:
      default_pipeline:
        receivers:
        - hostmetrics
        - iis
        - mssql
        processors:
        - metrics_filter

The following service configuration customizes log verbosity for the metrics submodule to be debug instead:

metrics:
  service:
    log_level: debug
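
The individual overrides shown in this section can be combined in one config.yaml file. The following sketch assumes a Linux VM and arbitrary values; it collects host metrics every 30 seconds, excludes process metrics, and raises the log verbosity of the metrics submodule:

```yaml
metrics:
  receivers:
    hostmetrics:
      type: hostmetrics
      collection_interval: 30s
  processors:
    metrics_filter:
      type: exclude_metrics
      metrics_pattern:
      - agent.googleapis.com/processes/*
  service:
    log_level: debug
    pipelines:
      default_pipeline:
        receivers: [hostmetrics]
        processors: [metrics_filter]
```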