如需将项目注册到管道语法预览版,请填写管道语法注册表单。
管道查询语法是 GoogleSQL 的扩展,支持线性查询结构,旨在让查询更易于读取、写入和维护。您可以在编写 GoogleSQL 的任何位置使用管道语法。
管道语法支持与现有 GoogleSQL 查询语法或标准语法相同的操作,例如选择、汇总和分组、联接和过滤,但这些操作可以按任何顺序应用任意次数。借助管道语法的线性结构,您可以编写查询,使查询语法的顺序与构建结果表所采取的逻辑步骤顺序一致。
使用管道语法的查询的定价、执行和优化方式与等效的标准语法查询相同。使用管道语法编写查询时,请遵循相关准则来估算费用和优化查询计算。
标准语法存在一些问题,可能会导致难以读取、编写和维护。下表显示了管道语法如何解决这些问题:
标准语法 | 管道语法 |
---|---|
子句必须按特定顺序显示。 | 可以按任意顺序应用管道运算符。 |
更复杂的查询(例如包含多级汇总的查询)通常需要 CTE 或嵌套子查询。 | 更复杂的查询通常通过在查询末尾添加管道运算符来表达。 |
在聚合期间,列会在 SELECT 、GROUP BY 和 ORDER BY 子句中重复。 |
每项汇总只能列出一次列。 |
如需了解完整的语法详情,请参阅管道查询语法参考文档。
基本语法
在管道语法中,查询以标准 SQL 查询或 FROM
子句开头。例如,独立的 FROM
子句(例如 FROM MyTable
)是有效的管道语法。然后,标准 SQL 查询的结果或 FROM
子句中的表可以作为输入传递给管道符号 |>
,后跟管道运算符名称和该运算符的任何实参。管道运算符会以某种方式转换表,并且该转换的结果可以传递给另一个管道运算符。
您可以在查询中使用任意数量的管道运算符,执行选择、排序、过滤、联接或汇总列等操作。管道运算符的名称与其标准语法对应项一致,并且通常具有相同的行为。标准语法和管道语法之间的主要区别在于查询的结构方式。随着查询表达的逻辑变得越来越复杂,查询仍可表示为管道运算符的线性序列,而无需使用深层嵌套的子查询,从而更易于阅读和理解。
管道语法具有以下主要特征:
- 管道语法中的每个管道运算符都由管道符号
|>
、运算符名称和任何参数组成:
|> operator_name argument_list
- 您可以将管道运算符添加到任何有效查询的末尾。
- 您可以按任意顺序应用管道运算符,并且可以应用任意次数。
- 管道语法适用于支持标准语法的任何位置:查询、视图、表值函数和其他上下文。
- 在同一查询中,可以将管道语法与标准语法混合使用。例如,子查询可以使用与父级查询不同的语法。
- 管道运算符可以看到管道前表格中存在的每个别名。
- 查询可以以
FROM
子句开头,并且可以选择在FROM
子句后添加管道运算符。
请参考下表:
CREATE OR REPLACE TABLE Produce AS (
SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
UNION ALL
SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
UNION ALL
SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
UNION ALL
SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);
以下每个查询都包含有效的管道语法,展示了如何顺序构建查询。
查询可以以 FROM
子句开头,且无需包含管道符:
-- View the table.
FROM Produce;
/*---------+-------+-----------+
| item | sales | category |
+---------+-------+-----------+
| apples | 7 | fruit |
| apples | 2 | fruit |
| carrots | 0 | vegetable |
| bananas | 15 | fruit |
+---------+-------+-----------*/
您可以使用 WHERE
管道运算符进行过滤:
-- Filter items with no sales.
FROM Produce
|> WHERE sales > 0;
/*---------+-------+-----------+
| item | sales | category |
+---------+-------+-----------+
| apples | 7 | fruit |
| apples | 2 | fruit |
| bananas | 15 | fruit |
+---------+-------+-----------*/
如需执行汇总,请使用 AGGREGATE
管道运算符,后跟任意数量的汇总函数,后跟 GROUP BY
子句。GROUP BY
子句是 AGGREGATE
管道运算符的一部分,而不是由管道符号 (|>
) 分隔。
-- Compute total sales by item.
FROM Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
GROUP BY item;
/*---------+-------------+-----------+
| item | total_sales | num_sales |
+---------+-------------+-----------+
| apples | 9 | 2 |
| bananas | 15 | 1 |
+---------+-------------+-----------*/
现在假设您有以下表格,其中包含每个项的 ID:
CREATE OR REPLACE TABLE ItemData AS (
SELECT 'apples' AS item, '123' AS id
UNION ALL
SELECT 'bananas' AS item, '456' AS id
UNION ALL
SELECT 'carrots' AS item, '789' AS id
);
您可以使用 JOIN
管道运算符将上一个查询的结果与此表联接,以包含每个项的 ID:
FROM Produce
|> WHERE sales > 0
|> AGGREGATE SUM(sales) AS total_sales, COUNT(*) AS num_sales
GROUP BY item
|> JOIN ItemData USING(item);
/*---------+-------------+-----------+-----+
| item | total_sales | num_sales | id |
+---------+-------------+-----------+-----+
| apples | 9 | 2 | 123 |
| bananas | 15 | 1 | 456 |
+---------+-------------+-----------+-----*/
与标准语法的主要区别
管道语法与标准语法有以下不同:
- 查询可以以
FROM
子句开头。 SELECT
管道运算符不会执行汇总。您必须改用AGGREGATE
管道运算符。- 过滤始终使用
WHERE
管道运算符完成,该运算符可应用于任何位置。WHERE
管道运算符(用于替换HAVING
和QUALIFY
)可以过滤聚合函数或窗口函数的结果。
如需了解详情,请参阅管道运算符的完整列表。
使用场景
管道语法的常见用例包括:
- 临时分析和增量查询构建:操作的逻辑顺序有助于更轻松地编写和调试查询。任何查询的前缀(包括管道符
|>
)都是有效查询,可帮助您查看长查询中的中间结果。效率提升有助于加快整个组织的开发流程。 - Log Analytics:Log Analytics 用户中流行的还有其他类型的管道语法。管道语法提供了熟悉的结构,可简化这些用户的和使用体验。
管道语法中的其他功能
除了少数例外情况外,管道语法支持标准语法支持的所有运算符,并且语法相同。此外,管道语法引入了以下管道运算符。
EXTEND
管道运算符
借助 EXTEND
管道运算符,您可以将计算列附加到当前表。EXTEND
管道运算符与 SELECT *, new_column
语句类似,但可让您更灵活地引用列别名。
请考虑下表,其中包含每位学生的两次考试成绩:
CREATE OR REPLACE TABLE Scores AS (
SELECT 'Alex' AS student, 9 AS score1, 10 AS score2, 10 AS points_possible
UNION ALL
SELECT 'Dana' AS student, 5 AS score1, 7 AS score2, 10 AS points_possible);
/*---------+--------+--------+-----------------+
| student | score1 | score2 | points_possible |
+---------+--------+--------+-----------------+
| Alex | 9 | 10 | 10 |
| Dana | 5 | 7 | 10 |
+---------+--------+--------+-----------------*/
假设您要计算每位学生在考试中的平均原始分数和平均百分比分数。在标准语法中,SELECT
语句中的后续列无法看到之前的别名。为避免使用子查询,您必须重复平均值表达式:
SELECT student,
(score1 + score2) / 2 AS average_score,
(score1 + score2) / 2 / points_possible AS average_percent
FROM Scores;
EXTEND
管道运算符可以引用之前使用的别名,使查询更易于阅读且不易出错:
FROM Scores
|> EXTEND (score1 + score2) / 2 AS average_score
|> EXTEND average_score / points_possible AS average_percent
|> SELECT student, average_score, average_percent;
/*---------+---------------+-----------------+
| student | average_score | average_percent |
+---------+---------------+-----------------+
| Alex | 9.5 | .95 |
| Dana | 6.0 | 0.6 |
+---------+---------------+-----------------*/
SET
管道运算符
借助 SET
管道运算符,您可以替换当前表中列的值。SET
管道运算符类似于 SELECT
* REPLACE (expression AS column)
语句。您可以使用表别名限定列名称,从而引用原始值。
FROM (SELECT 3 AS x, 5 AS y)
|> SET x = 2 * x;
/*---+---+
| x | y |
+---+---+
| 6 | 5 |
+---+---*/
DROP
管道运算符
借助 DROP
管道运算符,您可以从当前表中移除列。DROP
管道运算符类似于 SELECT *
EXCEPT(column)
语句。删除列后,您仍然可以通过使用表别名限定列名称来引用原始值。
FROM (SELECT 1 AS x, 2 AS y) AS t
|> DROP x;
/*---+
| y |
+---+
| 2 |
+---*/
RENAME
管道运算符
借助 RENAME
管道运算符,您可以重命名当前表中的列。RENAME
管道运算符类似于 SELECT *
EXCEPT(old_column), old_column AS new_column
语句。
FROM (SELECT 1 AS x, 2 AS y, 3 AS z) AS t
|> RENAME y AS w;
/*---+---+---+
| x | w | z |
+---+---+---+
| 1 | 2 | 3 |
+---+---+---*/
AGGREGATE
管道运算符
如需使用管道语法执行汇总,请使用 AGGREGATE
管道运算符,后跟任意数量的汇总函数,再后跟 GROUP BY
子句。您无需在 SELECT
子句中重复列。
本部分中的示例使用 Produce
表:
CREATE OR REPLACE TABLE Produce AS (
SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
UNION ALL
SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
UNION ALL
SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
UNION ALL
SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);
/*---------+-------+-----------+
| item | sales | category |
+---------+-------+-----------+
| apples | 7 | fruit |
| apples | 2 | fruit |
| carrots | 0 | vegetable |
| bananas | 15 | fruit |
+---------+-------+-----------*/
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP BY item, category;
/*---------+-----------+-------+-------------+
| item | category | total | num_records |
+---------+-----------+-------+-------------+
| apples | fruit | 9 | 2 |
| carrots | vegetable | 0 | 1 |
| bananas | fruit | 15 | 1 |
+---------+-----------+-------+-------------*/
如果您准备在聚合后立即对结果进行排序,可以在 GROUP BY
子句中使用 ASC
或 DESC
标记要排序的列。未标记的列不会排序。
如果您想对所有列进行排序,可以将 GROUP BY
子句替换为 GROUP AND ORDER BY
子句,后者默认会对每列进行升序排序。您可以在要按降序排序的列后面指定 DESC
。例如,以下三个查询是等效的:
-- Use a separate ORDER BY clause.
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP BY category, item
|> ORDER BY category DESC, item;
-- Explicitly mark how to order columns in the GROUP BY clause.
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP BY category DESC, item ASC;
-- Only mark descending columns in the GROUP AND ORDER BY clause.
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records
GROUP AND ORDER BY category DESC, item;
使用 GROUP AND ORDER BY
子句的好处在于,您不必在两个位置重复列名称。
如需执行全表聚合,请使用 GROUP BY()
或完全省略 GROUP BY
子句:
FROM Produce
|> AGGREGATE SUM(sales) AS total, COUNT(*) AS num_records;
/*-------+-------------+
| total | num_records |
+-------+-------------+
| 24 | 4 |
+-------+-------------*/
JOIN
管道运算符
借助 JOIN
管道运算符,您可以将当前表与另一个表联接,并且支持标准的联接操作,包括 CROSS
、INNER
、LEFT
、RIGHT
和 FULL
。
以下示例引用了 Produce
和 ItemData
表:
CREATE OR REPLACE TABLE Produce AS (
SELECT 'apples' AS item, 2 AS sales, 'fruit' AS category
UNION ALL
SELECT 'apples' AS item, 7 AS sales, 'fruit' AS category
UNION ALL
SELECT 'carrots' AS item, 0 AS sales, 'vegetable' AS category
UNION ALL
SELECT 'bananas' AS item, 15 AS sales, 'fruit' AS category);
CREATE OR REPLACE TABLE ItemData AS (
SELECT 'apples' AS item, '123' AS id
UNION ALL
SELECT 'bananas' AS item, '456' AS id
UNION ALL
SELECT 'carrots' AS item, '789' AS id
);
以下示例使用 USING
子句,避免了列歧义:
FROM Produce
|> JOIN `ItemData` USING(item)
|> WHERE item = 'apples';
/*--------+-------+----------+-----+
| item | sales | category | id |
+--------+-------+----------+-----+
| apples | 2 | fruit | 123 |
| apples | 7 | fruit | 123 |
+--------+-------+----------+-----*/
如需引用当前表中的列(例如,对 ON
子句中的列进行解析),您需要使用 AS
管道运算符为当前表创建别名。您可以选择为联接表添加别名。您可以在后续的管道运算符后引用这两个别名:
FROM Produce
|> AS produce_table
|> JOIN `ItemData` AS item_table
ON produce_table.item = item_table.item
|> WHERE produce_table.item = 'bananas'
|> SELECT item_table.item, sales, id;
/*---------+-------+-----+
| item | sales | id |
+---------+-------+-----+
| bananas | 15 | 123 |
+---------+-------+-----*/
联接的右侧无法看到联接的左侧,这意味着您无法将当前表与自身联接。例如,以下查询会失败:
-- This query doesn't work.
FROM Produce
|> AS produce_table
|> JOIN produce_table AS produce_table_2 USING(item);
如需与经过修改的表执行自联接,您可以在 WITH
子句中使用通用表表达式 (CTE)。
WITH cte_table AS (
FROM Produce
|> WHERE item = 'carrots'
)
FROM cte_table
|> JOIN cte_table AS cte_table_2 USING(item);
示例
请考虑下表,其中包含有关客户订单的信息:
CREATE OR REPLACE TABLE CustomerOrders AS (
SELECT 1 AS customer_id, 100 AS order_id, 'WA' AS state, 5 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 1 AS customer_id, 101 AS order_id, 'WA' AS state, 20 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 1 AS customer_id, 102 AS order_id, 'WA' AS state, 3 AS cost, 'food' AS item_type
UNION ALL
SELECT 2 AS customer_id, 103 AS order_id, 'NY' AS state, 16 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 2 AS customer_id, 104 AS order_id, 'NY' AS state, 22 AS cost, 'housewares' AS item_type
UNION ALL
SELECT 2 AS customer_id, 104 AS order_id, 'WA' AS state, 45 AS cost, 'clothing' AS item_type
UNION ALL
SELECT 3 AS customer_id, 105 AS order_id, 'MI' AS state, 29 AS cost, 'clothing' AS item_type);
假设您想要了解每个州和商品类型的回头客平均支出金额。您可以按如下方式编写查询:
SELECT state, item_type, AVG(total_cost) AS average
FROM
(
SELECT
SUM(cost) AS total_cost,
customer_id,
state,
item_type,
COUNT(*) OVER (PARTITION BY customer_id) AS num_orders
FROM CustomerOrders
GROUP BY customer_id, state, item_type
QUALIFY num_orders > 1
)
GROUP BY state, item_type
ORDER BY state DESC, item_type ASC;
如果您从上到下读取查询,则会在 total_cost
列被定义之前遇到该列。即使在子查询中,您也需要先读取列的名称,然后才能看到它们来自哪个表。
为了理解此查询,需要从内到外进行阅读。state
和 item_type
列在 SELECT
和 GROUP BY
子句中重复多次,然后在 ORDER BY
子句中再次重复。
以下等效查询使用管道语法编写:
FROM CustomerOrders
|> AGGREGATE SUM(cost) AS total_cost, GROUP BY customer_id, state, item_type
|> EXTEND COUNT(*) OVER (PARTITION BY customer_id) AS num_orders
|> WHERE num_orders > 1
|> AGGREGATE AVG(total_cost) AS average GROUP BY state DESC, item_type ASC;
/*-------+------------+---------+
| state | item_type | average |
+-------+------------+---------+
| WA | clothing | 35.0 |
| WA | food | 3.0 |
| NY | clothing | 16.0 |
| NY | housewares | 22.0 |
+-------+------------+---------*/
借助管道语法,您可以按照自己可能想到的解决原始问题的逻辑步骤编写查询。查询中的语法行对应于以下逻辑步骤:
- 先从客户订单表格开始。
- 按州了解每位客户在每种类型的商品上花了多少钱。
- 统计每个客户的订单数量。
- 将结果限制为回头客。
- 查找回头客在每个州和商品类型上的平均支出金额。
限制
相关资源
- 管道查询语法参考文档
- 标准查询语法参考文档
- 关于管道语法的 VLDB 2024 大会论文