使用管道查询语法

如需将项目注册到管道语法预览版,请填写管道语法注册表单

管道查询语法是 GoogleSQL 的扩展,支持线性查询结构,旨在让查询更易于读取、写入和维护。您可以在编写 GoogleSQL 的任何位置使用管道语法。

管道语法支持与现有 GoogleSQL 查询语法标准语法相同的操作,例如选择、汇总和分组、联接和过滤,但这些操作可以按任何顺序应用任意次数。借助管道语法的线性结构,您可以编写查询,使查询语法的顺序与构建结果表所采取的逻辑步骤顺序一致。

使用管道语法的查询的定价、执行和优化方式与等效的标准语法查询相同。使用管道语法编写查询时,请遵循相关准则来估算费用优化查询计算

标准语法存在一些问题,可能会导致难以读取、编写和维护。下表显示了管道语法如何解决这些问题:

标准语法 管道语法
子句必须按特定顺序显示。 可以按任意顺序应用管道运算符。
更复杂的查询(例如包含多级汇总的查询)通常需要 CTE 或嵌套子查询。 更复杂的查询通常通过在查询末尾添加管道运算符来表达。
在聚合期间,列会在 SELECTGROUP BYORDER BY 子句中重复。 每项汇总只能列出一次列。

如需了解完整的语法详情,请参阅管道查询语法参考文档。

基本语法

在管道语法中,查询以标准 SQL 查询或 FROM 子句开头。例如,独立的 FROM 子句(例如 FROM MyTable)是有效的管道语法。然后,标准 SQL 查询的结果或 FROM 子句中的表可以作为输入传递给管道符号 |>,后跟管道运算符名称和该运算符的任何实参。管道运算符会以某种方式转换表,并且该转换的结果可以传递给另一个管道运算符。

您可以在查询中使用任意数量的管道运算符,执行选择、排序、过滤、联接或汇总列等操作。管道运算符的名称与其标准语法对应项一致,并且通常具有相同的行为。标准语法和管道语法之间的主要区别在于查询的结构方式。随着查询表达的逻辑变得越来越复杂,查询仍可表示为管道运算符的线性序列,而无需使用深层嵌套的子查询,从而更易于阅读和理解。

管道语法具有以下主要特征:

  • 管道语法中的每个管道运算符都由管道符号 |>、运算符名称和任何参数组成:
    |> operator_name argument_list
  • 您可以将管道运算符添加到任何有效查询的末尾。
  • 您可以按任意顺序应用管道运算符,并且可以应用任意次数。
  • 管道语法适用于支持标准语法的任何位置:查询、视图、表值函数和其他上下文。
  • 在同一查询中,可以将管道语法与标准语法混合使用。例如,子查询可以使用与父级查询不同的语法。
  • 管道运算符可以看到管道前表格中存在的每个别名。
  • 查询可以FROM 子句开头,并且可以选择在 FROM 子句后添加管道运算符。

请参考下表:

CREATE OR REPLACE TABLE Produce AS (
  SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
  UNION ALL
  SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
  UNION ALL
  SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
  UNION ALL
  SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);

以下每个查询都包含有效的管道语法,展示了如何顺序构建查询。

查询可以FROM 子句开头,且无需包含管道符:

-- View the table.
FROM Produce;

/*---------+-------+-----------+
 | item    | sales | category  |
 +---------+-------+-----------+
 | apples  | 7     | fruit     |
 | apples  | 2     | fruit     |
 | carrots | 0     | vegetable |
 | bananas | 15    | fruit     |
 +---------+-------+-----------*/

您可以使用 WHERE 管道运算符进行过滤:

-- Filter items with no sales.
FROM Produce
|> WHERE sales > 0;

/*---------+-------+-----------+
 | item    | sales | category  |
 +---------+-------+-----------+
 | apples  | 7     | fruit     |
 | apples  | 2     | fruit     |
 | bananas | 15    | fruit     |
 +---------+-------+-----------*/

如需执行汇总,请使用 AGGREGATE 管道运算符,后跟任意数量的汇总函数,后跟 GROUP BY 子句。GROUP BY 子句是 AGGREGATE 管道运算符的一部分,而不是由管道符号 (|>) 分隔。

-- Compute total sales by item.
FROM Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
   GROUP BY item;

/*---------+-------------+-----------+
 | item    | total_sales | num_sales |
 +---------+-------------+-----------+
 | apples  | 9           | 2         |
 | bananas | 15          | 1         |
 +---------+-------------+-----------*/

现在假设您有以下表格,其中包含每个项的 ID:

CREATE OR REPLACE TABLE ItemData AS (
  SELECT 'apples' AS item, '123' AS id
  UNION ALL
  SELECT 'bananas' AS item, '456' AS id
  UNION ALL
  SELECT 'carrots' AS item, '789' AS id
);

您可以使用 JOIN 管道运算符将上一个查询的结果与此表联接,以包含每个项的 ID:

FROM Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
   GROUP BY item
|> JOIN ItemData USING(item);

/*---------+-------------+-----------+-----+
 | item    | total_sales | num_sales | id  |
 +---------+-------------+-----------+-----+
 | apples  | 9           | 2         | 123 |
 | bananas | 15          | 1         | 456 |
 +---------+-------------+-----------+-----*/

与标准语法的主要区别

管道语法与标准语法有以下不同:

如需了解详情,请参阅管道运算符的完整列表。

使用场景

管道语法的常见用例包括:

  • 临时分析和增量查询构建:操作的逻辑顺序有助于更轻松地编写和调试查询。任何查询的前缀(包括管道符 |>)都是有效查询,可帮助您查看长查询中的中间结果。效率提升有助于加快整个组织的开发流程。
  • Log Analytics:Log Analytics 用户中流行的还有其他类型的管道语法。管道语法提供了熟悉的结构,可简化这些用户的使用体验。

管道语法中的其他功能

除了少数例外情况外,管道语法支持标准语法支持的所有运算符,并且语法相同。此外,管道语法引入了以下管道运算符。

EXTEND 管道运算符

借助 EXTEND 管道运算符,您可以将计算列附加到当前表。EXTEND 管道运算符与 SELECT *, new_column 语句类似,但可让您更灵活地引用列别名。

请考虑下表,其中包含每位学生的两次考试成绩:

CREATE OR REPLACE TABLE Scores AS (
  SELECT 'Alex' AS student, 9 AS score1, 10 AS score2, 10 AS points_possible
  UNION ALL
  SELECT 'Dana' AS student, 5 AS score1, 7 AS score2, 10 AS points_possible);

/*---------+--------+--------+-----------------+
 | student | score1 | score2 | points_possible |
 +---------+--------+--------+-----------------+
 | Alex    | 9      | 10     | 10              |
 | Dana    | 5      | 7      | 10              |
 +---------+--------+--------+-----------------*/

假设您要计算每位学生在考试中的平均原始分数和平均百分比分数。在标准语法中,SELECT 语句中的后续列无法看到之前的别名。为避免使用子查询,您必须重复平均值表达式:

SELECT student,
  (score1 + score2) / 2 AS average_score,
  (score1 + score2) / 2 / points_possible AS average_percent
FROM Scores;

EXTEND 管道运算符可以引用之前使用的别名,使查询更易于阅读且不易出错:

FROM Scores
|> EXTEND (score1 + score2) / 2 AS average_score
|> EXTEND average_score / points_possible AS average_percent
|> SELECT student, average_score, average_percent;

/*---------+---------------+-----------------+
 | student | average_score | average_percent |
 +---------+---------------+-----------------+
 | Alex    | 9.5           | .95             |
 | Dana    | 6.0           | 0.6             |
 +---------+---------------+-----------------*/

SET 管道运算符

借助 SET 管道运算符,您可以替换当前表中列的值。SET 管道运算符类似于 SELECT * REPLACE (expression AS column) 语句。您可以使用表别名限定列名称,从而引用原始值。

FROM (SELECT 3 AS x, 5 AS y)
|> SET x = 2 * x;

/*---+---+
 | x | y |
 +---+---+
 | 6 | 5 |
 +---+---*/

DROP 管道运算符

借助 DROP 管道运算符,您可以从当前表中移除列。DROP 管道运算符类似于 SELECT * EXCEPT(column) 语句。删除列后,您仍然可以通过使用表别名限定列名称来引用原始值。

FROM (SELECT 1 AS x, 2 AS y) AS t
|> DROP x;

/*---+
 | y |
 +---+
 | 2 |
 +---*/

RENAME 管道运算符

借助 RENAME 管道运算符,您可以重命名当前表中的列。RENAME 管道运算符类似于 SELECT * EXCEPT(old_column), old_column AS new_column 语句。

FROM (SELECT 1 AS x, 2 AS y, 3 AS z) AS t
|> RENAME y AS w;

/*---+---+---+
 | x | w | z |
 +---+---+---+
 | 1 | 2 | 3 |
 +---+---+---*/

AGGREGATE 管道运算符

如需使用管道语法执行汇总,请使用 AGGREGATE 管道运算符,后跟任意数量的汇总函数,再后跟 GROUP BY 子句。您无需在 SELECT 子句中重复列。

本部分中的示例使用 Produce 表:

CREATE OR REPLACE TABLE Produce AS (
  SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
  UNION ALL
  SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
  UNION ALL
  SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
  UNION ALL
  SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);

/*---------+-------+-----------+
 | item    | sales | category  |
 +---------+-------+-----------+
 | apples  | 7     | fruit     |
 | apples  | 2     | fruit     |
 | carrots | 0     | vegetable |
 | bananas | 15    | fruit     |
 +---------+-------+-----------*/
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
   GROUP BY item, category;

/*---------+-----------+-------+-------------+
 | item    | category  | total | num_records |
 +---------+-----------+-------+-------------+
 | apples  | fruit     | 9     | 2           |
 | carrots | vegetable | 0     | 1           |
 | bananas | fruit     | 15    | 1           |
 +---------+-----------+-------+-------------*/

如果您准备在聚合后立即对结果进行排序,可以在 GROUP BY 子句中使用 ASCDESC 标记要排序的列。未标记的列不会排序。

如果您想对所有列进行排序,可以将 GROUP BY 子句替换为 GROUP AND ORDER BY 子句,后者默认会对每列进行升序排序。您可以在要按降序排序的列后面指定 DESC。例如,以下三个查询是等效的:

-- Use a separate ORDER BY clause.
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
   GROUP BY category, item
|> ORDER BY category DESC, item;
-- Explicitly mark how to order columns in the GROUP BY clause.
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
   GROUP BY category DESC, item ASC;
-- Only mark descending columns in the GROUP AND ORDER BY clause.
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
   GROUP AND ORDER BY category DESC, item;

使用 GROUP AND ORDER BY 子句的好处在于,您不必在两个位置重复列名称。

如需执行全表聚合,请使用 GROUP BY() 或完全省略 GROUP BY 子句:

FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records;

/*-------+-------------+
 | total | num_records |
 +-------+-------------+
 | 24    | 4           |
 +-------+-------------*/

JOIN 管道运算符

借助 JOIN 管道运算符,您可以将当前表与另一个表联接,并且支持标准的联接操作,包括 CROSSINNERLEFTRIGHTFULL

以下示例引用了 ProduceItemData 表:

CREATE OR REPLACE TABLE Produce AS (
  SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
  UNION ALL
  SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
  UNION ALL
  SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
  UNION ALL
  SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);
CREATE OR REPLACE TABLE ItemData AS (
  SELECT 'apples' AS item, '123' AS id
  UNION ALL
  SELECT 'bananas' AS item, '456' AS id
  UNION ALL
  SELECT 'carrots' AS item, '789' AS id
);

以下示例使用 USING 子句,避免了列歧义:

FROM Produce
|> JOIN `ItemData` USING(item)
|> WHERE item = 'apples';

/*--------+-------+----------+-----+
 | item   | sales | category | id  |
 +--------+-------+----------+-----+
 | apples | 2     | fruit    | 123 |
 | apples | 7     | fruit    | 123 |
 +--------+-------+----------+-----*/

如需引用当前表中的列(例如,对 ON 子句中的列进行解析),您需要使用 AS 管道运算符为当前表创建别名。您可以选择为联接表添加别名。您可以在后续的管道运算符后引用这两个别名:

FROM Produce
|> AS produce_table
|> JOIN `ItemData` AS item_table
   ON produce_table.item = item_table.item
|> WHERE produce_table.item = 'bananas'
|> SELECT item_table.item, sales, id;

/*---------+-------+-----+
 | item    | sales | id  |
 +---------+-------+-----+
 | bananas | 15    | 123 |
 +---------+-------+-----*/

联接的右侧无法看到联接的左侧,这意味着您无法将当前表与自身联接。例如,以下查询会失败:

-- This query doesn't work.
FROM Produce
|> AS produce_table
|> JOIN produce_table AS produce_table_2 USING(item);

如需与经过修改的表执行自联接,您可以在 WITH 子句中使用通用表表达式 (CTE)。

WITH cte_table AS (
  FROM Produce
  |> WHERE item = 'carrots'
)
FROM cte_table
|> JOIN cte_table AS cte_table_2 USING(item);

示例

请考虑下表,其中包含有关客户订单的信息:

CREATE OR REPLACE TABLE CustomerOrders AS (
  SELECT 1 AS customer_id, 100 AS order_id, 'WA' AS state, 5 AS cost, 'clothing' AS item_type
  UNION ALL
  SELECT 1 AS customer_id, 101 AS order_id, 'WA' AS state, 20 AS cost, 'clothing' AS item_type
  UNION ALL
  SELECT 1 AS customer_id, 102 AS order_id, 'WA' AS state, 3 AS cost, 'food' AS item_type
  UNION ALL
  SELECT 2 AS customer_id, 103 AS order_id, 'NY' AS state, 16 AS cost, 'clothing' AS item_type
  UNION ALL
  SELECT 2 AS customer_id, 104 AS order_id, 'NY' AS state, 22 AS cost, 'housewares' AS item_type
  UNION ALL
  SELECT 2 AS customer_id, 104 AS order_id, 'WA' AS state, 45 AS cost, 'clothing' AS item_type
  UNION ALL
  SELECT 3 AS customer_id, 105 AS order_id, 'MI' AS state, 29 AS cost, 'clothing' AS item_type);

假设您想要了解每个州和商品类型的回头客平均支出金额。您可以按如下方式编写查询:

SELECT state, item_type, AVG(total_cost) AS average
FROM
  (
    SELECT
      SUM(cost) AS total_cost,
      customer_id,
      state,
      item_type,
      COUNT(*) OVER (PARTITION BY customer_id) AS num_orders
    FROM CustomerOrders
    GROUP BY customer_id, state, item_type
    QUALIFY num_orders > 1
  )
GROUP BY state, item_type
ORDER BY state DESC, item_type ASC;

如果您从上到下读取查询,则会在 total_cost 列被定义之前遇到该列。即使在子查询中,您也需要先读取列的名称,然后才能看到它们来自哪个表。

为了理解此查询,需要从内到外进行阅读。stateitem_type 列在 SELECTGROUP BY 子句中重复多次,然后在 ORDER BY 子句中再次重复。

以下等效查询使用管道语法编写:

FROM CustomerOrders
|> AGGREGATE SUM(cost) AS total_cost, GROUP BY customer_id, state, item_type
|> EXTEND COUNT(*) OVER (PARTITION BY customer_id) AS num_orders
|> WHERE num_orders > 1
|> AGGREGATE AVG(total_cost) AS average GROUP BY state DESC, item_type ASC;

/*-------+------------+---------+
 | state | item_type  | average |
 +-------+------------+---------+
 | WA    | clothing   | 35.0    |
 | WA    | food       | 3.0     |
 | NY    | clothing   | 16.0    |
 | NY    | housewares | 22.0    |
 +-------+------------+---------*/

借助管道语法,您可以按照自己可能想到的解决原始问题的逻辑步骤编写查询。查询中的语法行对应于以下逻辑步骤:

  • 先从客户订单表格开始。
  • 按州了解每位客户在每种类型的商品上花了多少钱。
  • 统计每个客户的订单数量。
  • 将结果限制为回头客。
  • 查找回头客在每个州和商品类型上的平均支出金额。

限制

  • 您无法在 SELECT 语句中在管道运算符后面添加差分隐私子句。而是应使用标准语法中的差分隐私子句,并在查询后应用管道运算符。
  • 您无法在管道语法中使用命名窗口