Logging query language

This page describes the logging query language that is used to query your logs data and create logs sinks.

Queries can be used in the Cloud Logging query-builder pane in the Google Cloud Console, the Logging API, or the command-line interface.

The query builder language syntax

This section discusses how queries are structured and how matching is performed.

A query is a Boolean expression that specifies a subset of all the log entries in your project. The entries it selects can come from any number of logs.

You can build queries based on the following four dimensions using the logical operators AND and OR:

  • Resource: For more information, go to resource.type.
  • Log name: For more information, go to logName.
  • Severity: For more information, go to severity.
  • Time range: For more information, go to timestamp.

For examples of common queries you might want to use, go to Sample queries using Logs Viewer (Preview).

Syntax notation

The query syntax is described using the following notation:

  • a = e means that a is a name for the expression e.
  • a b means "a followed by b."
  • a | b means "a or b."
  • ( e ) is used for grouping.
  • [ e ] means that e is optional.
  • { e } means that e can be repeated zero or more times.
  • "abc" means that abc must be written just as it appears.

Syntax summary

This section provides a quick overview of the query syntax. Some details have been omitted, but they are explained in the following sections.

A query is a string containing an expression:

    expression = ["NOT"] comparison { ("AND" | "OR") ["NOT"] comparison }

A comparison is either a single value or a Boolean expression:

  "The cat in the hat"
  resource.type = "gae_app"

The first line is an example of a comparison that is a single value. These types of comparisons are global restrictions. Each field of a log entry is compared to the value by implicitly using the has operator. For this example, if any field in a LogEntry, or if its payload, contains the phrase "The cat in the hat", then the comparison is successful.

The second line is an example of a comparison that is a Boolean expression of the form [FIELD_NAME] [OP] [VALUE]. The elements of the comparison are described below:

  • [FIELD_NAME]: is a field in a log entry. For example, resource.type.

  • [OP]: is a comparison operator. For example, =.

  • [VALUE]: is a number, string, function, or parenthesized expression. For example, "gae_app".

The following sections provide more details about queries and matching.

Boolean operators

The Boolean operators AND and OR are short-circuit operators. The NOT operator has the highest precedence, followed by OR and AND in that order. For example, the following two expressions are equivalent:

a OR NOT b AND NOT c OR d
(a OR (NOT b)) AND ((NOT c) OR d)
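
The equivalence of the two expressions above can be checked with a small truth-table test. The following Python sketch is only an illustration, not the parser that Logging uses: the helper names (tokenize, parse, evaluate, equivalent) are hypothetical, and the parser assumes well-formed input made of single-word names, parentheses, and the three operators.

```python
from itertools import product


def tokenize(text):
    """Split a query into names, operators, and parentheses."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Build a tree using the precedence described above: NOT, then OR, then AND."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_and():  # lowest precedence
        node = parse_or()
        while peek() == "AND":
            advance()
            node = ("AND", node, parse_or())
        return node

    def parse_or():
        node = parse_not()
        while peek() == "OR":
            advance()
            node = ("OR", node, parse_not())
        return node

    def parse_not():  # highest precedence
        if peek() == "NOT":
            advance()
            return ("NOT", parse_not())
        if peek() == "(":
            advance()
            node = parse_and()
            advance()  # consume the closing ")"
            return node
        return advance()  # a plain name such as "a"

    return parse_and()


def evaluate(node, env):
    """Evaluate a parsed tree against a dict that maps names to booleans."""
    if isinstance(node, str):
        return env[node]
    if node[0] == "NOT":
        return not evaluate(node[1], env)
    left, right = evaluate(node[1], env), evaluate(node[2], env)
    return (left and right) if node[0] == "AND" else (left or right)


def equivalent(query_1, query_2, names="abcd"):
    """Return True if both queries agree for every assignment of the names."""
    tree_1, tree_2 = parse(tokenize(query_1)), parse(tokenize(query_2))
    return all(
        evaluate(tree_1, dict(zip(names, values)))
        == evaluate(tree_2, dict(zip(names, values)))
        for values in product([False, True], repeat=len(names))
    )
```

Note that this precedence differs from many programming languages, including Python, where and binds more tightly than or. That is why the sketch parses the expression itself instead of relying on Python's own operators.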

You can omit the AND operator between comparisons. You can also replace the NOT operator with the - (minus) operator. For example, the following two queries are the same:

a=b AND c=d AND NOT e=f
a=b c=d -e=f
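
As a rough illustration of how the shorthand maps to the explicit form, the following Python sketch rewrites a query that omits AND and uses the minus sign. The function explicit_form is hypothetical, and it assumes that each comparison is a single whitespace-free token with no quoted strings or parentheses.

```python
OPERATORS = {"AND", "OR", "NOT"}


def explicit_form(query: str) -> str:
    """Rewrite shorthand (implicit AND, leading minus) with explicit AND and NOT.

    Assumption: every comparison is one whitespace-free token.
    """
    tokens = []
    for raw in query.split():
        if raw.startswith("-") and len(raw) > 1:
            tokens += ["NOT", raw[1:]]  # a leading minus is shorthand for NOT
        else:
            tokens.append(raw)

    out = []
    for token in tokens:
        # An operand followed directly by another operand, or by NOT, implies AND.
        if out and out[-1] not in OPERATORS and token not in {"AND", "OR"}:
            out.append("AND")
        out.append(token)
    return " ".join(out)
```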

This documentation always uses AND and NOT.

Comparisons

Comparisons have the following form:

[FIELD_NAME] [OP] [VALUE]

The elements of the comparison are described below:

  • [FIELD_NAME]: is the path name of a field in a log entry. Examples of the field name are:

    resource.type
    resource.labels.zone
    resource.labels.project_id
    insertId
    jsonPayload.httpRequest.protocol
    labels."compute.googleapis.com/resource_id"
    

    For details, review field path identifiers on this page.

  • [OP]: is a comparison operator, one of the following:

    =           # equal
    !=          # not equal
    > < >= <=   # numeric ordering
    :           # "has" matches any substring in the log entry field
    =~          # regular expression search for a pattern
    !~          # regular expression search that must not match a pattern
    
  • [VALUE]: is a number, string, function, or parenthesized expression. Strings are used to represent arbitrary text, plus Boolean, enumeration, and byte-string values. The [VALUE] is converted to the field's type prior to the comparison.

If [VALUE] is a parenthesized Boolean combination of comparisons, then the field name and the comparison operator are applied to each element. For example:

    jsonPayload.cat = ("longhair" OR "shorthair")
    jsonPayload.animal : ("nice" AND "pet")

The first comparison checks that the field cat has the value "longhair" or "shorthair". The second checks that the value of the field animal contains both of the words "nice" and "pet", in any order.
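
One way to picture this rule is that the field name and operator are distributed over each value inside the parentheses. The following Python sketch uses hypothetical helpers (distribute and has_all) to show the expansion and the "in any order" behavior of the second comparison. It is a simplified model and ignores case normalization.

```python
def distribute(field: str, op: str, values: list[str], joiner: str) -> str:
    """Expand `field op (v1 JOINER v2 ...)` into one comparison per value."""
    parts = [f'{field} {op} "{value}"' for value in values]
    return f" {joiner} ".join(parts)


def has_all(field_value: str, words: list[str]) -> bool:
    """Model of `field : (w1 AND w2 ...)`: every word must appear somewhere."""
    return all(word in field_value for word in words)


expanded = distribute("jsonPayload.cat", "=", ["longhair", "shorthair"], "OR")
```

Here expanded holds the two comparisons joined by OR, and has_all("a pet that is nice", ["nice", "pet"]) is true even though the words appear in the opposite order.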

Field path identifiers

All log entries are instances of type LogEntry. The identifier that is (or begins) the left-hand side of a comparison must be a field defined in the LogEntry type. For details on the possible identifiers and their values, review the LogEntry type.

Here is the current list of log entry fields. Each field is followed by the next level of names for that field, if applicable:

  • httpRequest { cacheFillBytes, cacheHit, cacheLookup, cacheValidatedWithOriginServer, latency, protocol, referer, remoteIp, requestMethod, requestSize, requestUrl, responseSize, serverIp, status, userAgent }
  • insertId
  • jsonPayload { variable }
  • labels { variable }
  • logName
  • metadata { systemLabels, userLabels }
  • operation { id, producer, first, last }
  • protoPayload { @type, variable }
  • receiveTimestamp
  • resource { type, labels }
  • severity
  • sourceLocation { file, line, function }
  • spanId
  • textPayload
  • timestamp
  • trace

Following are examples of field path identifiers you can use in your comparisons:

  • resource.type: If your first path identifier is resource, then the next identifier must be a field in the MonitoredResource type.

  • httpRequest.latency: If your first path identifier is httpRequest, then the next identifier must be a field in the HttpRequest type.

  • labels.[KEY]: If your first path identifier is labels, then the next identifier, [KEY], must be one of the keys from the key-value pairs appearing in the labels field.

  • logName: Since the logName field is a string, you can't follow it by any subfield names.

For more information on using field path identifiers that reference objects or arrays, go to Object and array types on this page.

Monitored resource types

For faster queries, specify a monitored resource type. For a list of resource types, go to Monitored resource types.

For example, Compute Engine VMs use the resource type gce_instance and Amazon EC2 instances use aws_ec2_instance. The following example shows how to limit your queries to both types of VMs:

    resource.type = ("gce_instance" OR "aws_ec2_instance")

The monitored resource type values in logs are indexed. Using substring matches for them results in slower queries.
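
If you assemble queries in code, a small helper can keep the resource type restriction an equality test, so that it can use the index. The following Python sketch is one possible approach; the function name resource_type_filter is hypothetical.

```python
def resource_type_filter(*resource_types: str) -> str:
    """Build an equality restriction on resource.type for one or more types."""
    quoted = [f'"{resource_type}"' for resource_type in resource_types]
    if len(quoted) == 1:
        return f"resource.type = {quoted[0]}"
    return f"resource.type = ({' OR '.join(quoted)})"
```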

Missing fields

If you use a field name in a query, and that field doesn't appear in a log entry, then the field is missing, undefined, or defaulted:

  • If the field is part of the log entry's payload (jsonPayload or protoPayload), or if it is in a label in the labels section of the log entry, then the field is missing. Using a missing field won't display an error, but all comparisons using missing fields fail silently.

    Examples: jsonPayload.nearest_store, protoPayload.name.nickname

  • If the field is defined in the LogEntry type, then the field is defaulted. Comparisons are performed as if the field were present and had its default value.

    Examples: httpRequest.remoteIp, trace, operation.producer

  • Otherwise, the field is undefined, which is an error that is detected before the query is used.

    Examples: thud, operation.thud, textPayload.thud

To test if a missing or defaulted field exists without testing for a particular value in the field, use the :* comparison. For example, the following comparison succeeds if the field operation.id is explicitly present in a log entry:

operation.id:*
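
The three cases can be modeled with a lookup against the field list shown earlier on this page. The following Python sketch is an illustration only: the SCHEMA table is abbreviated, and the function classify_absent_field is a hypothetical helper, not part of any Logging API.

```python
VARIABLE = "variable"  # marks fields whose sub-fields are not fixed by LogEntry

# Abbreviated model of the LogEntry fields listed on this page.
# None means the field is a scalar with no sub-fields.
SCHEMA = {
    "httpRequest": {"latency", "protocol", "remoteIp", "requestMethod", "status"},
    "insertId": None,
    "jsonPayload": VARIABLE,
    "labels": VARIABLE,
    "logName": None,
    "operation": {"id", "producer", "first", "last"},
    "protoPayload": VARIABLE,
    "textPayload": None,
    "trace": None,
}


def classify_absent_field(path: str) -> str:
    """Say how a field that is absent from a log entry is treated."""
    head, _, rest = path.partition(".")
    if head not in SCHEMA:
        return "undefined"
    children = SCHEMA[head]
    if children == VARIABLE:
        return "missing"  # payload or label field: comparisons fail silently
    if not rest:
        return "defaulted"  # defined in LogEntry: compared using its default value
    if children is None:
        return "undefined"  # scalar fields can't have sub-fields
    return "defaulted" if rest in children else "undefined"
```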

Object and array types

Each log entry field can hold a scalar, object, or array.

  • A scalar field stores a single value, like 174.4 or -1. A string is also considered a scalar. Fields that can be converted to (or from) a string, such as Duration and Timestamp, are also scalar types.

  • An object type stores a collection of named values, like the following JSON value:

    {"age": 24, "height": 67}
    

    You can refer to a value inside an object. For example, if jsonPayload.x contained the preceding value, then jsonPayload.x.age would have the value 24.

  • An array field stores a list of values—all of the same type. For example, a field holding measurements might have an array of numbers:

    {8.5, 9, 6}
    

    When comparisons are performed and [FIELD_NAME] is an array field, each member of the array is compared to [VALUE] and the results are joined together using the OR operator. For example, if jsonPayload.shoeSize is an array field that stores {8.5, 9, 6}, the comparison:

    jsonPayload.shoeSize < 7
    

    is equivalent to:

    8.5 < 7 OR 9 < 7 OR 6 < 7
    

    In this example, the overall comparison succeeds, because 6 < 7 is true.
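
The OR-joining of array members corresponds to an "any" test. A minimal Python sketch, with the hypothetical helper array_compare standing in for the comparison on an array field:

```python
def array_compare(values, predicate):
    """Succeed if the comparison holds for at least one member of the array."""
    return any(predicate(value) for value in values)


shoe_sizes = [8.5, 9, 6]

# Models: jsonPayload.shoeSize < 7
result = array_compare(shoe_sizes, lambda size: size < 7)
```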

Values and conversions

The first step in evaluating a comparison is to convert the right-hand side value to the type of the log entry field. Scalar field types are permitted in comparisons, along with two additional types whose values are represented as strings: Duration and Timestamp. For a list of scalar types, go to the scalar protocol buffer types list. The following table explains what values can be converted to the log field types:

Field type | Permitted query value
bool | "true" or "false" in any letter case. Examples: "True", "true".
bytes | A string containing any sequence of bytes. Example: "\377\377".
Duration | A string containing a signed decimal number followed by one of the units "ns", "us", "ms", "s", "m", or "h". Durations are accurate to nanoseconds. Example: "3.2s".
enum | The name of an enumeration type literal, case-insensitive. Example: "WARNING", which is a value of type LogSeverity.
double | Any number, with or without a sign and an exponent part, or the special value strings "NaN", "-Infinity", and "Infinity" (either capitalized or not). Examples: "-3.2e-8", "nan".
intNN | Any signed integer that doesn't exceed the size of the type. Example: "-3".
string | Any string that contains UTF-8 encoded or 7-bit ASCII text. Embedded quotation marks must be escaped with a backslash.
Timestamp | A string in RFC 3339 or ISO 8601 format. Examples: "2014-10-02T15:01:23.045Z" (RFC 3339), "2014-10-02" (ISO 8601). In query expressions, timestamps in RFC 3339 format can specify a timezone with "Z" or ±hh:mm. Timestamps are represented to nanosecond accuracy.
uintNN | Any unsigned integer that doesn't exceed the size of the type. Example: "1234".

If an attempted conversion fails, then the comparison fails.
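
As an illustration of one conversion, the following Python sketch turns a Duration string into nanoseconds and treats a failed conversion as a failed comparison. The regular expression and the helpers parse_duration and duration_greater are assumptions made for this example. They follow the description of Duration values above but aren't taken from the Logging implementation.

```python
import re
from decimal import Decimal

_UNIT_TO_NANOS = {
    "ns": 1,
    "us": 10**3,
    "ms": 10**6,
    "s": 10**9,
    "m": 60 * 10**9,
    "h": 3600 * 10**9,
}

# A signed decimal number followed by exactly one unit.
_DURATION_RE = re.compile(r"([+-]?(?:\d+\.?\d*|\.\d+))(ns|us|ms|s|m|h)")


def parse_duration(text):
    """Return the duration in whole nanoseconds, or None if conversion fails."""
    match = _DURATION_RE.fullmatch(text)
    if match is None:
        return None
    number, unit = match.groups()
    # Decimal avoids binary floating-point rounding for values such as 3.2.
    return int(Decimal(number) * _UNIT_TO_NANOS[unit])


def duration_greater(left, right):
    """Model of `left > right` for Duration values."""
    left_ns, right_ns = parse_duration(left), parse_duration(right)
    if left_ns is None or right_ns is None:
        return False  # a failed conversion makes the comparison fail
    return left_ns > right_ns
```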

When a conversion requires a string, you can also use a number or unquoted text if they don't contain special characters such as spaces and operators. Similarly, when a conversion requires a number, you can use a string whose content is a number.

The types intNN and uintNN represent integer types of various sizes, such as int32 and uint64. When writing a value to be converted to a 64-bit integer type, you write the value as a string, such as "9223372036854775807".

Types of log fields

Here is how the type of a log entry field is determined:

  • Log fields defined in the type LogEntry, and in its component types, are protocol buffer fields. Protocol buffer fields have explicit types.

  • Log fields that are part of protoPayload objects are also protocol buffer fields and have explicit types. The name of the protocol buffer type is stored in the field "@type" of protoPayload. For more information, review the JSON mapping.

  • Log fields inside of jsonPayload have types that are inferred from the field's value when the log entry is received:

    • Fields whose values are unquoted numbers have type double.
    • Fields whose values are true or false have type bool.
    • Fields whose values are strings have type string.

    Long (64-bit) integers are stored in string fields, because they cannot be represented exactly as double values.

  • The Duration and Timestamp types are recognized only in protocol buffer fields. Elsewhere, those values are stored in string fields.
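
The inference rules for jsonPayload fields, and the reason 64-bit integers are kept in strings, can be illustrated in Python. This is a sketch under assumptions: inferred_type is a hypothetical helper that mirrors the rules above, and the sample payload is invented for the example.

```python
import json


def inferred_type(value):
    """Mirror the jsonPayload inference rules described above."""
    if isinstance(value, bool):  # test bool first: in Python, bool is a subclass of int
        return "bool"
    if isinstance(value, (int, float)):
        return "double"  # every unquoted number is treated as a double
    if isinstance(value, str):
        return "string"
    return "object or array"


# Invented sample payload; "big" is quoted so that it keeps its exact value.
payload = json.loads(
    '{"count": 3, "ok": true, "name": "web-1", "big": "9223372036854775807"}'
)
types = {key: inferred_type(value) for key, value in payload.items()}

# A double has 53 bits of precision, so the largest int64 does not survive.
largest_int64 = 9223372036854775807
round_trip = int(float(largest_int64))
```

In this sketch, round_trip differs from largest_int64, which shows the loss of precision that storing the value as a string avoids.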

Comparison operators

The meaning of the equality (=, !=) and inequality (<, <=, >, >=) operators depends on the underlying type of the left-hand field name.

  • All numeric types: Equality and inequality have their normal meaning for numbers.
  • bool: Equality means the same Boolean value. Inequality is defined by true>false.
  • enum: Equality means the same enumeration value. Inequality uses the underlying numeric values of the enumeration literals.
  • Duration: Equality means the same duration length. Inequality is based on the length of the duration. Example: as durations, "1s">"999ms".
  • Timestamp: Equality means the same instant in time. If a and b are Timestamp values, a < b means a is earlier in time than b.
  • bytes: Operands are compared byte by byte, left-to-right.
  • string: Comparisons ignore letter case. Specifically, both operands are first normalized using NFKC_CF Unicode normalization and are then compared lexicographically. However, regular expression searches are not normalized. For more information on searching log entries using regular expressions, see Querying and filtering log entries using regular expressions.

The substring operator (:) is applicable to string and bytes, and is handled like equality except that the right-hand operand need only equal some part of the left-hand field. Substring matches on indexed fields don't take advantage of log indexes.
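
The effect of normalization on string equality and on the substring operator can be approximated in Python. The standard library has no NFKC_CF function, so this sketch combines NFKC normalization with case folding as an approximation. The helper names are hypothetical.

```python
import unicodedata


def normalize(text: str) -> str:
    """Approximate NFKC_CF: apply NFKC, fold case, then apply NFKC again."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return unicodedata.normalize("NFKC", folded)


def string_equals(field_value: str, query_value: str) -> bool:
    """Model of the = operator on string fields."""
    return normalize(field_value) == normalize(query_value)


def string_has(field_value: str, query_value: str) -> bool:
    """Model of the : operator: the query value need only match part of the field."""
    return normalize(query_value) in normalize(field_value)
```

For example, string_equals("KUBERNETES", "kubernetes") is true, and so is string_equals("\ufb01le", "FILE"), because the "fi" ligature is normalized to the two letters f and i.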

Global restrictions

If the comparison consists of a single value, it is called a global restriction. Logging uses the has (:) operator to determine if any field in a log entry, or if its payload, contains the global restriction. If it does, then the comparison succeeds.

The simplest query written in terms of a global restriction is a single value:

"The Cat in The Hat"

You can combine global restrictions using the AND and OR operators for a more interesting query. For example, if you want to display all log entries that have a field that contains cat and a field that contains either hat or bat, write the query as:

(cat AND (hat OR bat))

In this case, there are three global restrictions: cat, hat and bat. These global restrictions are applied separately and the results are combined, just as if the expression had been written without parentheses.
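
The following Python sketch models how such a query could be evaluated against a log entry represented as a nested dictionary. It is a simplified illustration: the helpers are hypothetical, and lowercasing stands in for the full string normalization that Logging performs.

```python
def field_values(value):
    """Yield every scalar value in a nested dict or list, converted to text."""
    if isinstance(value, dict):
        for child in value.values():
            yield from field_values(child)
    elif isinstance(value, list):
        for child in value:
            yield from field_values(child)
    else:
        yield str(value)


def has_anywhere(entry, text):
    """Model of a global restriction: some field contains the text."""
    needle = text.lower()
    return any(needle in value.lower() for value in field_values(entry))


def matches(entry):
    """Model of the query (cat AND (hat OR bat))."""
    return has_anywhere(entry, "cat") and (
        has_anywhere(entry, "hat") or has_anywhere(entry, "bat")
    )
```

Because each restriction is applied separately, an entry matches even when cat and bat appear in different fields, such as {"textPayload": "The cat sat", "labels": {"item": "bat"}}.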

A global restriction is an easy way to query your logs for a particular value. For example, if you are looking in your activity log for entries containing any mention of GCE_OPERATION_DONE, you can use the following query:

    logName = "projects/my-project-id/logs/compute.googleapis.com%2Factivity_log" AND
    "GCE_OPERATION_DONE"

Although global restrictions are easy, they can be slow; for more information, review Finding log entries quickly on this page.

Functions

You can use built-in functions as global restrictions in queries:

function = identifier ( [ argument { , argument } ] )

where argument is a value, field name, or a parenthesized expression. The functions are described in the following sections.

sample

The sample function selects a fraction of the total number of log entries:

sample([FIELD], [FRACTION])

[FIELD] is the name of a field in the log entry, such as logName or jsonPayload.a_field. The value of the field determines whether the log entry is in the sample. The field type must be a string or numeric value. Setting [FIELD] to insertId is a good choice, because every log entry has a different value for that field.

[FRACTION] is the fraction of log entries that have values for [FIELD] to include. It is a number greater than 0.0 and no greater than 1.0. For example, if you specify 0.01, then the sample contains roughly one percent of all log entries that have values for [FIELD]. If [FRACTION] is 1, then all the log entries that have values for [FIELD] are chosen.

Example: The following query returns roughly 25 percent of the log entries from the log syslog:

    logName = "projects/my-project/logs/syslog" AND sample(insertId, 0.25)

Details: A deterministic algorithm, based on hashing, is used to determine if a log entry is included in, or excluded from, the sample. The accuracy of the resulting sample depends on the distribution of the hashed values. If the hashed values aren't uniformly distributed, then the resulting sample can be skewed. In the worst case, when [FIELD] always contains the same value, every log entry produces the same hash, so the resulting sample contains either all of the log entries or none of them.

If [FIELD] does appear in a log entry, then:

  • A hash of the value is computed.
  • The hashed value, which is a number, is divided by the maximum possible hashed value.
  • If the resulting fraction is less than or equal to [FRACTION], the log entry is included in the sample; otherwise it is excluded from the sample.
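
The three steps above can be sketched in Python. The hash function that Logging uses isn't specified on this page; the use of SHA-256 and of its first 64 bits is an assumption made only so that the example runs. The function name in_sample is hypothetical.

```python
import hashlib

_MAX_HASH = 2**64 - 1  # largest value of the 64-bit hash used in this sketch


def in_sample(field_value, fraction: float) -> bool:
    """Decide deterministically whether an entry with this field value is sampled."""
    # Step 1: compute a hash of the value (SHA-256 is an assumption).
    digest = hashlib.sha256(str(field_value).encode("utf-8")).digest()
    hashed = int.from_bytes(digest[:8], "big")
    # Step 2: divide by the maximum possible hashed value.
    ratio = hashed / _MAX_HASH
    # Step 3: include the entry if the ratio doesn't exceed the fraction.
    return ratio <= fraction
```

Because the decision depends only on the field value, the same log entry is always either in or out of the sample for a given fraction, and entries that share a value are sampled together.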

If [FIELD] doesn't appear in a log entry, then:

  • If [FIELD] is part of the log entry's payload or labels sections, the log entry isn't selected for the sample, even if [FRACTION] is 1.
  • Otherwise, the log entry is treated as if [FIELD] is in the log entry and the value of [FIELD] is the default value. The default value is determined by the LogEntry type. For more information on missing and defaulted fields, review Missing fields on this page.

To exclude log entries with defaulted fields from the sample, use the field-exists operator, :*. The following query produces a 1 percent sample of log entries that have explicitly supplied a value for field:

field:* AND sample(field, 0.01)

ip_in_net

The ip_in_net function determines if an IP address in a log entry is contained in a subnet. You might use this to tell if a request comes from an internal or external source. The function has the following form:

ip_in_net([FIELD], [SUBNET])

[FIELD] is a string-valued field in the log entry that contains an IP address or range. The field can be repeating, in which case only one of the repeated fields has to have an address or range contained in the subnet.

[SUBNET] is a string constant for an IP address or range. It is an error if [SUBNET] isn't a legal IP address or range, as described later in this section.

Example: The following query tests an IP address in the payload of log entries from the log my_log:

    logName = "projects/my_project/logs/my_log" AND
    ip_in_net(jsonPayload.realClientIP, "10.1.2.0/24")

Details: If, in a log entry, [FIELD] is missing, defaulted, or it does not contain a legal IP address or range, then the function returns false. For more information on missing and defaulted fields, review Missing fields on this page.

Examples of the supported IP addresses and ranges follow:

  • IPv4: 10.1.2.3
  • IPv4 subnet: 10.1.2.0/24
  • IPv6: 1234:5678:90ab:cdef:1234:5678:90ab:cdef
  • IPv6 subnet (CIDR notation): 1:2::/48
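
The behavior described in this section can be approximated with Python's standard ipaddress module. This is a sketch of the semantics as described above, not the Logging implementation. The Python function reuses the name ip_in_net for readability, and passing a list models a repeated field.

```python
import ipaddress


def ip_in_net(field_value, subnet: str) -> bool:
    """Return True if an address or range in the field lies inside the subnet."""
    # An illegal subnet is an error: ip_network raises ValueError.
    network = ipaddress.ip_network(subnet, strict=False)

    # A repeated field matches if any one of its values is contained.
    values = field_value if isinstance(field_value, list) else [field_value]
    for value in values:
        if not isinstance(value, str):
            continue  # a missing or non-string value never matches
        try:
            candidate = ipaddress.ip_network(value, strict=False)
        except ValueError:
            continue  # not a legal IP address or range
        if candidate.version == network.version and candidate.subnet_of(network):
            return True
    return False
```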

Searching by time

In the interface, you can set specific limits on the date and time of log entries to show. For example, if you add the following conditions to your query, the Logs Viewer displays exactly the log entries in the indicated 30-minute period and you won't be able to scroll outside of that date range:

timestamp >= "2016-11-29T23:00:00Z"
timestamp <= "2016-11-29T23:30:00Z"

When writing a query with a timestamp, you must use dates and times in the format shown above. You must also select No limit from the time-range selector below the search-query box.
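
If you generate time restrictions in code, you can format the bounds from datetime values. The following Python sketch builds a restriction like the one above; the function time_window_filter is hypothetical, and it joins the two comparisons with an explicit AND.

```python
from datetime import datetime, timedelta, timezone

_RFC3339_UTC = "%Y-%m-%dT%H:%M:%SZ"


def time_window_filter(start: datetime, minutes: int) -> str:
    """Build a timestamp restriction covering `minutes` from `start`."""
    if start.tzinfo is None:
        raise ValueError("start must be timezone-aware")
    start_utc = start.astimezone(timezone.utc)
    end_utc = start_utc + timedelta(minutes=minutes)
    return (
        f'timestamp >= "{start_utc.strftime(_RFC3339_UTC)}" AND '
        f'timestamp <= "{end_utc.strftime(_RFC3339_UTC)}"'
    )
```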

Using regular expressions

You can use regular expressions to build queries and create filters for sinks, metrics, and wherever log filters are used. You can use regular expressions in the Query builder and with the gcloud command-line tool.

A regular expression is a sequence of characters that defines a search. The query language uses the RE2 syntax. For a complete explanation of the RE2 syntax, go to the RE2 wiki on GitHub.

Regular expression queries have the following characteristics:

  • Only fields of the string type can be matched with a regular expression.

  • String normalization isn't performed; for example, kubernetes isn't considered the same as KUBERNETES. For more information, see the Comparison operators section.

  • Queries are case sensitive and not anchored by default.

  • Boolean operators can be used between multiple regular expressions on the right side of the regular expression comparison operator, =~ and !~.

A regular expression query has the following structure:

Match a pattern:

jsonPayload.message =~ "regular expression pattern"

Does not match a pattern:

jsonPayload.message !~ "regular expression pattern"

The =~ and !~ operators change the comparison into a regular expression query, and the pattern you're trying to match must be within double quotation marks. To query for patterns that contain double quotation marks, escape them using a backslash.

Examples querying logs using regular expressions

Query type | Example
Standard query | sourceLocation.file =~ "foo"
Query with case-insensitive search | labels.subnetwork_name =~ "(?i)foo"
Query containing quotation marks | jsonPayload.message =~ "field1=\"bar.*\""
Query using a Boolean OR | labels.pod_name =~ "(foo|bar)"
Query using anchors | logName =~ "/my%2Flog$"
Query not matching a pattern | labels.pod_name !~ "foo"
Query using Boolean operator | labels.env =~ ("^prod.*server" OR "^staging.*server")
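
To experiment with patterns like these before using them in a query, you can try them locally. The following Python sketch uses the standard re module as a stand-in. Python's re isn't RE2, and the two differ in places (for example, RE2 doesn't support backreferences or lookaround), but the patterns in the preceding table use only constructs that both support. The helper names are hypothetical.

```python
import re


def regex_match(field_value: str, pattern: str) -> bool:
    """Model of =~ : an unanchored, case-sensitive search."""
    return re.search(pattern, field_value) is not None


def regex_not_match(field_value: str, pattern: str) -> bool:
    """Model of !~ : succeeds when the pattern is found nowhere in the field."""
    return not regex_match(field_value, pattern)


def regex_match_any(field_value: str, patterns) -> bool:
    """Model of field =~ ("p1" OR "p2"): any one pattern may match."""
    return any(regex_match(field_value, pattern) for pattern in patterns)
```

For example, regex_match("FOO", "foo") is false because matching is case sensitive, while regex_match("FOO", "(?i)foo") is true.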