About the MQL language

This page provides general information about Monitoring Query Language (MQL), including the following topics:

  • Shortcuts for table operations and functions
  • Variations on a fetch
  • Strict-form queries
  • Matching the resource.project_id column
  • Ratios and the “edge effect”
  • MQL date formats
  • Length and complexity of queries
  • MQL macros

This information applies whether you use MQL from the Google Cloud Console or from the Cloud Monitoring API. For information about the structure of MQL queries, see Examples.

Shortcuts for table operations and functions

Queries usually consist of chains of table operations connected by pipes (|). Each table operation starts with the name of the operation, followed by a list of expressions. Expressions can contain function calls that list all of their arguments explicitly, but Monitoring Query Language lets you express queries by using a number of shortcuts.

This section describes shortcuts for table operations, using functions as table operations, and a shortcut for value columns as arguments to functions.

For a full list, see Table operation shortcuts.

Shortcuts for table operations

When using the fetch, group_by, and filter operations, you can omit the explicit table operation when the arguments are sufficient to determine the intended operation. For example, the following query:

gce_instance::compute.googleapis.com/instance/cpu/utilization

is equivalent to:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization

The following group_by operations are equivalent:

         [zone], mean(val())
group_by [zone], mean(val())

You can omit the word filter if you parenthesize the filter test. For example, the following two filter operations are equivalent:

       (instance_name =~ 'apache.*')
filter instance_name =~ 'apache.*'

You can combine these shortcut forms in your queries. For example, the following query:

gce_instance::compute.googleapis.com/instance/cpu/utilization
| (instance_name =~ 'apache.*')
|  [zone], mean(val())

is equivalent to this more explicit form:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| filter instance_name =~ 'apache.*'
| group_by [zone], mean(val())

For more information on shortcuts for table operations, see Table operation shortcuts in the MQL reference.

Using a function as a table operation

A table operation usually starts with the name of a table operation. But MQL allows a table operation to start with a function name instead.

You can use a function name when the named function performs a transformation of the value columns of the input table. This replacement is a shortcut for the group_by, align, or value table operations, depending on the kind of function whose name is given.

The general form is:

|  FUNCTION_NAME ARG, ARG ... 

In the table operation, the function takes the value columns of the input table as its initial arguments, followed by any arguments given for the function itself. When you use a function as a table operation, you supply its arguments in table-operation form, as a comma-separated list, rather than inside the parentheses (()) ordinarily used with functions.

The full table operation generated by expanding the shortcut depends on the kind of function:

  • group_by: If you are using an aggregating function with a group_by operation that aggregates away all the time-series identifier columns (that is, it uses [] for grouping), then you can use the function as a shortcut. For example,

    | distribution powers_of(1.1)

    is a shortcut for

    | group_by [], distribution(val(0), powers_of(1.1))
  • align: If you are using an aligning function as an argument to the align operation, you can use the function as a shortcut. For example,

    | delta

    is a shortcut for

    | align delta()

    Similarly,

    | rate 10m

    is a shortcut for

    | align rate(10m)

    Note that aligner functions take the input time series as an implicit argument, so the value columns are not given explicitly here.

  • value: All other functions can act as shortcuts for the value table operation. For example,

    | mul 3.3

    is a shortcut for

    | value mul(val(0), 3.3)

    Similarly,

    | div

    is a shortcut for

    | value div(val(0), val(1))

    Note that the div shortcut takes an input table with two value columns and produces a table with one value column containing the ratio.

Shortcut for value-column functions

You can use .function as a shortcut for function(val()) if there is a single value column in the input, or as a shortcut for function(val(0), val(1)) if there are two value columns, and so forth.

The leading dot means, “Call the following function, supplying the input-point value column (or columns) as the argument(s) to the function.”

For example, .mean is a shortcut for mean(val()). The following are equivalent:

group_by [zone], .mean
group_by [zone], mean(val())

If the input table has multiple value columns, each column becomes an argument to the function in this shortcut. For example, if the input table has two value columns, then

.div

is a shortcut for

div(val(0), val(1))

With this shortcut, you can supply arguments that do not refer to value columns. The additional arguments are supplied after the value-column arguments. For example, if the input table has one value column, then

.div(3)

is equivalent to

div(val(0), 3)

Variations on a fetch

The fetch operation usually returns a time-series table named by a pair of monitored-resource and metric types. For example:

fetch gce_instance :: compute.googleapis.com/instance/cpu/utilization

If the metric applies only to one monitored-resource type, then you can omit the monitored resource from the query. The following query is equivalent to the previous query, because the CPU-utilization metric applies only to gce_instance monitored resources:

fetch compute.googleapis.com/instance/cpu/utilization

The fetch operation can also specify only a monitored-resource type, with the metric specified in a subsequent metric operation. For example, the following query is equivalent to the previous fetch examples:

fetch gce_instance
| metric compute.googleapis.com/instance/cpu/utilization

Splitting the fetch this way can be useful when you want to fetch two different metrics for the same monitored resource. For example, the following query computes the number of packets per CPU-second consumed:

fetch gce_instance
| {
    metric compute.googleapis.com/instance/network/received_packets_count ;
    metric compute.googleapis.com/instance/cpu/usage_time
  }
| ratio

Splitting the fetch also lets you apply filtering only to the labels of the monitored resource:

fetch gce_instance
| filter resource.zone =~ "asia.*"
| {
    metric compute.googleapis.com/instance/network/received_packets_count ;
    metric compute.googleapis.com/instance/cpu/usage_time
  }
| ratio

A fetch that names only a monitored-resource type must be followed by a metric operation, perhaps with intervening filter operations.
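
For example, a minimal sketch of this form, with a filter between the fetch and the metric operation; the zone regex is only an illustrative value:

# Sketch only: the zone regex is an arbitrary example value.
fetch gce_instance
| filter resource.zone =~ "asia.*"
| metric compute.googleapis.com/instance/cpu/utilization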

Strict-form queries

A strict query is one with no shortcuts or implicit values:

  • All shortcuts are replaced.
  • All implicit arguments are made explicit.
  • Columns are referred to by full names.
  • New columns are explicitly given names.
  • Any implicitly supplied alignment operations are given explicitly.

Using the strict form makes the query more resilient to changes in the structure of input tables, and it can make it clearer what the query is doing. Putting a query in strict form does not make the query any more efficient.

When you save a query for a chart or an alerting policy, it is converted to strict form. The confirmation dialog for the save operation displays the strict form.

With both shortcuts and strict forms available, you might encounter equivalent MQL queries that look very different from each other. For example, the following query, which computes the number of received packets per consumed CPU second, uses many shortcuts:

gce_instance
| (zone =~ ".*-a")
| {
    compute.googleapis.com/instance/network/received_packets_count ;
    compute.googleapis.com/instance/cpu/usage_time
  }
| join
| div

When you save this query to a chart or as part of an alerting policy, the resulting strict form query does exactly the same thing. However, the strict form might look quite different, as shown in the following example:

fetch gce_instance
| filter (resource.zone =~ '.*-a')
| { t_0:
      metric 'compute.googleapis.com/instance/network/received_packets_count'
      | align delta() ;
    t_1:
      metric 'compute.googleapis.com/instance/cpu/usage_time'
      | align delta() }
| join
| value [v_0: div(t_0.value.received_packets_count, t_1.value.usage_time)]

When you edit the saved definition of the chart or alerting policy, the Query Editor displays the strict form.

Matching the resource.project_id column

Google Cloud projects have a display name, which appears in menus but does not uniquely identify the project. A project display name might be "Monitoring demo".

Projects also have two fields that act as identifiers:

  • Project ID: a unique string identifier. This is often based on the display name. Project IDs are set when the project is created, usually by concatenating the elements of the project name and possibly adding digits to the end, if needed for uniqueness. A project ID might have the form "monitoring-demo" or "monitoring-demo-2349". The project ID is sometimes casually called the project name.
  • Project number: a unique numeric identifier.

Every monitored-resource type includes a project_id label, whose value is a string representation of the project number of the project that owns the resource and the data about that resource.

In MQL queries, you refer to this label as resource.project_id. The resource.project_id label has the project number in text form as its value, but MQL converts that value to the project ID in certain situations.

In the following cases, MQL treats the value of the resource.project_id label as the project ID rather than the project number:

  • The legend for a chart displays the project ID rather than the project number for the value of the resource.project_id label.

  • Equality comparisons of the value of the resource.project_id label to a string literal recognize both the project number and the project ID. For example, both of the following return true for resources owned by this project:

    • resource.project_id == "monitoring-demo"
    • resource.project_id == "530310927541"

    This case applies for the == and != operators and for their function forms, eq() and ne().

  • A regular-expression match on the resource.project_id label works properly against either the project number or the project ID. For example, both of the following expressions return true for resources owned by this project:

    • resource.project_id =~ "monitoring-.*"
    • resource.project_id =~ ".*27541"

    This case applies for the =~ and !~ operators and for the function form, re_full_match.

For all other cases, the actual value of the resource.project_id label is used. For example, concatenate("project-", resource.project_id) results in the value project-530310927541 and not project-monitoring-demo.

Ratios and the “edge effect”

In general, it is best to compute ratios based on time series collected for a single metric type, by using label values. A ratio computed over two different metric types is subject to an edge effect.

For example, suppose that you have two different metric types, an RPC total count and an RPC error count, and you want to compute the ratio of error RPCs to total RPCs. An unsuccessful RPC is counted in the time series of both metric types, so there is a chance that, when you align the time series, the unsuccessful RPC appears in one alignment interval in the total count but in a different alignment interval in the error count. This difference can happen for several reasons, including the following:

  • Because there are two different time series recording the same event, there are two underlying counter values implementing the collection, and they won't be updated atomically.
  • The sampling rates might differ. When the time series are aligned to a common period, counts for a single event might appear in adjacent alignment intervals in the time series for the different metrics.

The difference in the number of values in corresponding alignment intervals can lead to nonsensical error/total ratio values like 1/0 or 2/1.

The edge effect is typically less for ratios between larger numbers. You can get larger numbers by aggregation, either by using an alignment window that is longer than the sampling period, or by grouping together data for certain labels. These techniques minimize the effect of small differences in the number of points in a given interval; a two-point disparity is more significant if the expected number of points in an interval is three than if the expected number is 300.
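
For example, the following is a sketch of this approach, based on the packet-and-CPU query shown earlier; the 10m alignment window and the zone grouping are illustrative choices, not requirements:

# Sketch only: the 10m alignment window and the zone grouping are
# illustrative; they aggregate each metric before the ratio is computed.
fetch gce_instance
| {
    metric compute.googleapis.com/instance/network/received_packets_count
    | align delta(10m)
    | group_by [zone] ;
    metric compute.googleapis.com/instance/cpu/usage_time
    | align delta(10m)
    | group_by [zone]
  }
| ratio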

If you are using built-in metric types, then you might have no choice but to compute ratios across metric types to get the value you need.

If you are designing custom metrics that might count the same thing, like RPCs returning error status, in two different metrics, consider instead a single metric that includes each count only once. For example, if you are counting RPCs and you want to track the ratio of unsuccessful RPCs to all RPCs, create a single metric type to count RPCs, and use a label to record the status of the invocation, including the "OK" status. Then each status value, error or "OK", is recorded by updating a single counter for that case.
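
For example, the following is a minimal sketch, assuming a hypothetical custom delta metric custom.googleapis.com/rpc_count with a status metric label; the error ratio is then computed within the single metric:

# Sketch only: 'custom.googleapis.com/rpc_count' and its 'status' label
# are hypothetical names for such a single metric.
fetch gce_instance :: custom.googleapis.com/rpc_count
| align delta()
| group_by [zone],
    [error_ratio: sum(if(metric.status != 'OK', val(), 0)) / sum(val())]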

MQL date formats

MQL currently supports a limited number of date formats. In MQL queries, dates are expressed as one of the following:

  • d'BASE_STRING'
  • D'BASE_STRING'

The BASE_STRING is a string of the form 2010/06/23-19:32:15-07:00. The first dash (-), separating the date and time, can be replaced by a space. In the time component, parts of the clock time (19:32:15) or the timezone specifier (-07:00) can be dropped.

The following examples are valid dates in MQL queries:

  • d'2010/06/23-19:32:15-07:00'
  • d'2010/06/23 19:32:15-07:00'
  • d'2010/06/23 19:32:15'
  • D'2010/06/23 19:32'
  • d'2010/06/23-19'
  • D'2010/06/23 -07:00'

The following list gives the grammar for the BASE_STRING, where each entry shows the accepted formats:

  • Date: %Y/%m/%d
  • Date, hour: %Y/%m/%d %H or %Y/%m/%d-%H
  • Date, hour, minute: %Y/%m/%d %H:%M or %Y/%m/%d-%H:%M
  • Date, hour, minute, second: %Y/%m/%d %H:%M:%S or %Y/%m/%d-%H:%M:%S
  • Date, hour, minute, fractional second: %Y/%m/%d %H:%M:%E*S or %Y/%m/%d-%H:%M:%E*S
  • Date with timezone: %Y/%m/%d %Ez
  • Date, hour, with timezone: %Y/%m/%d %H%Ez or %Y/%m/%d-%H%Ez
  • Date, hour, minute, with timezone: %Y/%m/%d %H:%M%Ez or %Y/%m/%d-%H:%M%Ez
  • Date, hour, minute, second, with timezone: %Y/%m/%d %H:%M:%S%Ez or %Y/%m/%d-%H:%M:%S%Ez
  • Date, hour, minute, fractional second, with timezone: %Y/%m/%d %H:%M:%E*S%Ez or %Y/%m/%d-%H:%M:%E*S%Ez

Length and complexity of queries

Monitoring Query Language queries can be long and complex, but not without limits.

  • A query text, encoded as UTF-8, is limited to 10,000 bytes.
  • A query is limited to 2,000 language constructs; that is, the AST complexity is limited to 2,000 nodes.

An abstract syntax tree, or AST, is a representation of source code—in this case, the MQL query string—in which nodes in the tree map to syntactic structures in the code.

MQL macros

MQL includes a macro-definition utility, def. You can use the MQL macro-definition utility to replace repeated operations, make complex queries more readable, and make query development easier. You can define macros for table operations and for functions.

Macros for table operations

You can write macros to make new table operations. The general syntax looks like the following:

def MACRO_NAME [MACRO_PARAMETER[, MACRO_PARAMETER]] = MACRO_BODY ;

To invoke the macro, use the following syntax:

@MACRO_NAME [MACRO_ARG [, MACRO_ARG]]

For example, suppose you are using the following query to fetch CPU utilization data:

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| every 1m
| group_by [zone], mean(val())

The first line can be replaced with the following macro:

def my_fetch = fetch gce_instance::compute.googleapis.com/instance/cpu/utilization ;

To invoke the macro in the query, replace the original fetch as follows:

def my_fetch = fetch gce_instance::compute.googleapis.com/instance/cpu/utilization ;

@my_fetch
| every 1m
| group_by [zone], mean(val())

You can replace the second and third lines with macros that take arguments. The macro definition lists the macro's parameters, and in the macro body, you refer to a parameter as $MACRO_PARAMETER. For example, you can define the following macros:

def my_every time_arg = every $time_arg ;

def my_group label, aggr = group_by [$label], $aggr ;

To invoke these macros and provide the arguments, specify the arguments in a comma-delimited list in the macro invocations. The following shows the query with all the defined macros and their invocations:

def my_fetch = fetch gce_instance::compute.googleapis.com/instance/cpu/utilization ;
def my_every time_arg = every $time_arg ;
def my_group label, aggr = group_by [$label], $aggr ;

{@my_fetch}
| @my_every 1m
| @my_group zone, mean(val())

Macros for functions

For macros that act as MQL functions, you specify any parameters in a comma-delimited list in parentheses. The parentheses distinguish a function macro from a table-operation macro, and they must appear in the invocation even if there are no arguments. The general form of a definition looks like the following:

def MACRO_NAME([MACRO_PARAMETER [, MACRO_PARAMETER]]) = MACRO_BODY ;

For example, the following query retrieves tables for two metrics, combines the two tables into one with two value columns, and computes the percentage of received bytes out of total bytes in a column called received_percent:

{
  fetch k8s_pod :: kubernetes.io/pod/network/received_bytes_count ;
  fetch k8s_pod :: kubernetes.io/pod/network/sent_bytes_count
}
| join
| value [received_percent: val(0) * 100 / (val(0) + val(1))]

You can replace the received_percent computation with a macro like the following example:

def recd_percent(recd, sent) = $recd * 100 / ($recd + $sent) ;

To invoke a function macro, use the following syntax:

@MACRO_NAME([MACRO_ARG[, MACRO_ARG]])

When invoking a function macro with no arguments, you must specify the empty parentheses to distinguish the invocation from the invocation of a table-operation macro.
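
For example, the following sketch defines and invokes a no-argument function macro; the macro name double_value and the doubled column are hypothetical:

# Sketch only: 'double_value' is a hypothetical macro name.
def double_value() = val(0) * 2 ;

fetch gce_instance::compute.googleapis.com/instance/cpu/utilization
| value [doubled: @double_value()]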

The following example shows the previous query with a macro for the ratio computation:

def recd_percent(recd, sent) = $recd * 100 / ($recd + $sent) ;

{
  fetch k8s_pod :: kubernetes.io/pod/network/received_bytes_count ;
  fetch k8s_pod :: kubernetes.io/pod/network/sent_bytes_count
}
| join
| value [received_percent: @recd_percent(val(0), val(1))]

Limitations

The MQL macro feature does not support the following:

  • Nesting of macros.
  • Recursively defined macros. No macro body can reference any macro, including itself, that is not yet fully defined.
  • Use of macro-defined functions as table operations.
  • Use of macro arguments as names of functions or table operations.