Lexical structure and syntax

A GoogleSQL statement comprises a series of tokens. Tokens include identifiers, quoted identifiers, literals, keywords, operators, and special characters. You can separate tokens with comments or whitespace such as spaces, backspaces, tabs, or newlines.

Identifiers

Identifiers are names that are associated with columns, tables, fields, path expressions, and more. They can be unquoted or quoted and some are case-sensitive.

Unquoted identifiers

  • Must begin with a letter or an underscore (_) character.
  • Subsequent characters can be letters, numbers, or underscores (_).

Quoted identifiers

Identifier examples

Path expression examples:

-- Valid. _5abc and dataField are valid identifiers.
_5abc.dataField

-- Valid. `5abc` and dataField are valid identifiers.
`5abc`.dataField

-- Invalid. 5abc is an invalid identifier because it is unquoted and starts
-- with a number rather than a letter or underscore.
5abc.dataField

-- Valid. abc5 and dataField are valid identifiers.
abc5.dataField

-- Invalid. abc5! is an invalid identifier because it is unquoted and contains
-- a character that is not a letter, number, or underscore.
abc5!.dataField

-- Valid. `GROUP` and dataField are valid identifiers.
`GROUP`.dataField

-- Invalid. GROUP is an invalid identifier because it is unquoted and is a
-- stand-alone reserved keyword.
GROUP.dataField

-- Valid. abc5 and GROUP are valid identifiers.
abc5.GROUP

Function examples:

-- Valid. dataField is a valid identifier in a function called foo().
foo().dataField

Array access operation examples:

-- Valid. dataField is a valid identifier in an array called items.
items[OFFSET(3)].dataField

Named query parameter examples:

-- Valid. param and dataField are valid identifiers.
@param.dataField

Table name examples:

-- Valid table path.
myproject.mydatabase.mytable287
-- Valid table path.
myproject287.mydatabase.mytable
-- Invalid table path. The project name starts with a number and is unquoted.
287myproject.mydatabase.mytable
-- Invalid table name. The table name is unquoted and is not a valid
-- dashed identifier, as the part after the dash is neither a number nor
-- an identifier starting with a letter or an underscore.
mytable-287a
-- Valid table path.
my-project.mydataset.mytable
-- Valid table name.
my-table
-- Invalid table path because the dash is not in the first part
-- of the path.
myproject.mydataset.my-table
-- Invalid table path because a dataset name cannot contain dashes.
my-dataset.mytable

Path expressions

A path expression describes how to navigate to an object in a graph of objects and generally follows this structure:

path:
  [path_expression][. ...]

path_expression:
  [first_part]/subsequent_part[ { / | : | - } subsequent_part ][...]

first_part:
  { unquoted_identifier | quoted_identifier }

subsequent_part:
  { unquoted_identifier | quoted_identifier | number }
  • path: A graph of one or more objects.
  • path_expression: An object in a graph of objects.
  • first_part: A path expression can start with a quoted or unquoted identifier. If the path expressions starts with a reserved keyword, it must be a quoted identifier.
  • subsequent_part: Subsequent parts of a path expression can include non-identifiers, such as reserved keywords. If a subsequent part of a path expressions starts with a reserved keyword, it may be quoted or unquoted.

Examples:

foo.bar
foo.bar/25
foo/bar:25
foo/bar/25-31
/foo/bar
/25/foo/bar

Table names

A table name represents the name of a table.

  • Table names can be quoted identifiers or unquoted identifiers.
  • A table name that is an unquoted identifier can additionally include single dashes if the table name is referenced in a FROM or TABLE clause. Only the first identifier in the table path (the project ID or the table name) can have dashes. Dashes are not supported in datasets.
  • A table name can be a fully qualified table name (table path) that includes up to three quoted or unquoted identifiers:

    For example: myproject.mydataset.mytable

  • Table names can be path expressions.

  • Table names have case-sensitivity rules.

  • Table names have additional rules.

Examples:

my-project.mydataset.mytable
mydataset.mytable
my-table
mytable
`287mytable`

Column names

A column name represents the name of a column in a table.

  • Column names can be quoted identifiers or unquoted identifiers.
  • If unquoted, identifiers support dashed identifiers when referenced in a FROM or TABLE clause.
  • Column names have additional rules.

Examples:

columnA
column-a
`287column`

Field names

A field name represents the name of a field inside a complex data type such as a struct or JSON object.

  • A field name can be a quoted identifier or an unquoted identifier.
  • Field names must adhere to all of the rules for column names.

Literals

A literal represents a constant value of a built-in data type. Some, but not all, data types can be expressed as literals.