在 GoogleSQL for BigQuery 中,数组是由零个或更多相同数据类型的值组成的有序列表。您可以构建简单数据类型(如 INT64
)的数组和复杂数据类型(如 STRUCT
)的数组。目前,ARRAY
数据类型除外,因为不支持数组的数组。如需详细了解 ARRAY
数据类型(包括 NULL
处理),请参阅数组类型。
借助 GoogleSQL,您可以构建数组字面量、使用 ARRAY
函数基于子查询构建数组,还可以使用 ARRAY_AGG
函数将值聚合到数组中。
您可以使用 ARRAY_CONCAT()
等函数合并数组,也可以使用 ARRAY_TO_STRING()
将数组转换为字符串。
访问数组元素
假设有以下名为 Sequences
的模拟表。该表包含 ARRAY
数据类型的列 some_numbers
。
WITH
Sequences AS (
SELECT [0, 1, 1, 2, 3, 5] AS some_numbers UNION ALL
SELECT [2, 4, 8, 16, 32] UNION ALL
SELECT [5, 10]
)
SELECT * FROM Sequences
/*---------------------*
| some_numbers |
+---------------------+
| [0, 1, 1, 2, 3, 5] |
| [2, 4, 8, 16, 32] |
| [5, 10] |
*---------------------*/
如需访问 some_numbers
列中的数组元素,请指定要使用的索引类型:index
或 OFFSET(index)
(用于从 0 开始的索引),或 ORDINAL(index)
(用于从 1 开始的索引)。
例如:
SELECT
some_numbers,
some_numbers[0] AS index_0,
some_numbers[OFFSET(1)] AS offset_1,
some_numbers[ORDINAL(1)] AS ordinal_1
FROM Sequences
/*--------------------+---------+----------+-----------*
| some_numbers | index_0 | offset_1 | ordinal_1 |
+--------------------+---------+----------+-----------+
| [0, 1, 1, 2, 3, 5] | 0 | 1 | 0 |
| [2, 4, 8, 16, 32] | 2 | 4 | 2 |
| [5, 10] | 5 | 10 | 5 |
*--------------------+---------+----------+-----------*/
查找长度
ARRAY_LENGTH
函数用于返回数组的长度。
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
ARRAY_LENGTH(some_numbers) AS len
FROM Sequences;
/*--------------------+--------*
| some_numbers | len |
+--------------------+--------+
| [0, 1, 1, 2, 3, 5] | 6 |
| [2, 4, 8, 16, 32] | 5 |
| [5, 10] | 2 |
*--------------------+--------*/
将数组中的元素转换为表中的行
如需将 ARRAY
转换为一组行(即执行展平操作),请使用 UNNEST
运算符。UNNEST
获取一个 ARRAY
,然后返回一个表,ARRAY
中的每个元素均占该表的一行。
由于 UNNEST
破坏了 ARRAY
元素的顺序,您可能需要恢复表中的顺序。为此,请使用可选的 WITH OFFSET
子句返回另一个包含各数组元素偏移量的列,然后使用 ORDER BY
子句按偏移量对行进行排序。
示例
SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
AS element
WITH OFFSET AS offset
ORDER BY offset;
/*----------+--------*
| element | offset |
+----------+--------+
| foo | 0 |
| bar | 1 |
| baz | 2 |
| qux | 3 |
| corge | 4 |
| garply | 5 |
| waldo | 6 |
| fred | 7 |
*----------+--------*/
如需展平一整列 ARRAY
,同时保留每行中其他列的值,请使用相互关联的 INNER JOIN
将包含 ARRAY
列的表联接到该 ARRAY
列的 UNNEST
输出。
通过相互关联的联接,UNNEST
运算符会引用源表每一行中的 ARRAY
类型列,该列之前曾出现在 FROM
子句中。对于源表中的每一行 N
,UNNEST
将行 N
的 ARRAY
展平成一组包含 ARRAY
元素的行,然后相关 INNER JOIN
或 CROSS JOIN
会将这组新行与源表的单行 N
组合起来。
示例
下述示例使用 UNNEST
为数组列中的每个元素返回一行。使用 INNER JOIN
后,id
列显示 Sequences
中包含每个数字的行的 id
值。
WITH
Sequences AS (
SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers
)
SELECT id, flattened_numbers
FROM Sequences
INNER JOIN UNNEST(Sequences.some_numbers) AS flattened_numbers;
/*------+-------------------*
| id | flattened_numbers |
+------+-------------------+
| 1 | 0 |
| 1 | 1 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 5 |
| 2 | 2 |
| 2 | 4 |
| 2 | 8 |
| 2 | 16 |
| 2 | 32 |
| 3 | 5 |
| 3 | 10 |
*------+-------------------*/
请注意,对于相关联接,UNNEST
运算符为可选运算符,INNER JOIN
可表示为 CROSS JOIN
或逗号交叉联接。使用英文逗号交叉联接简写表示法,上述示例合并如下:
WITH
Sequences AS (
SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers
)
SELECT id, flattened_numbers
FROM Sequences, Sequences.some_numbers AS flattened_numbers;
/*------+-------------------*
| id | flattened_numbers |
+------+-------------------+
| 1 | 0 |
| 1 | 1 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 5 |
| 2 | 2 |
| 2 | 4 |
| 2 | 8 |
| 2 | 16 |
| 2 | 32 |
| 3 | 5 |
| 3 | 10 |
*------+-------------------*/
查询嵌套数组
如果一个表包含 STRUCT
的 ARRAY
,则可展平 ARRAY
以查询 STRUCT
的字段。您也可以展平 STRUCT
值的 ARRAY
类型字段。
查询 ARRAY 中的 STRUCT
元素
以下示例将 UNNEST
与 INNER JOIN
结合使用,以展平 STRUCT
的 ARRAY
。
WITH
Races AS (
SELECT
"800M" AS race,
[
STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
] AS participants
)
SELECT
race,
participant
FROM Races AS r
INNER JOIN UNNEST(r.participants) AS participant;
/*------+---------------------------------------*
| race | participant |
+------+---------------------------------------+
| 800M | {Rudisha, [23.4, 26.3, 26.4, 26.1]} |
| 800M | {Makhloufi, [24.5, 25.4, 26.6, 26.1]} |
| 800M | {Murphy, [23.9, 26, 27, 26]} |
| 800M | {Bosse, [23.6, 26.2, 26.5, 27.1]} |
| 800M | {Rotich, [24.7, 25.6, 26.9, 26.4]} |
| 800M | {Lewandowski, [25, 25.7, 26.3, 27.2]} |
| 800M | {Kipketer, [23.2, 26.1, 27.3, 29.4]} |
| 800M | {Berian, [23.7, 26.1, 27, 29.3]} |
*------+---------------------------------------*/
您可以从重复字段中找到特定信息。例如,以下查询用于返回 800 米赛跑中跑得最快的参赛选手。
示例
WITH
Races AS (
SELECT
"800M" AS race,
[
STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
] AS participants
)
SELECT
race,
(
SELECT name
FROM UNNEST(participants)
ORDER BY (SELECT SUM(duration) FROM UNNEST(laps) AS duration) ASC
LIMIT 1
) AS fastest_racer
FROM Races;
/*------+---------------*
| race | fastest_racer |
+------+---------------+
| 800M | Rudisha |
*------+---------------*/
查询 STRUCT 中的 ARRAY
类型字段
您也可以从嵌套的重复字段中获取信息。例如,以下语句用于返回在 800 米赛跑中圈速最快的参赛选手。
WITH
Races AS (
SELECT
"800M" AS race,
[
STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
]AS participants
)
SELECT
race,
(
SELECT name
FROM UNNEST(participants), UNNEST(laps) AS duration
ORDER BY duration ASC
LIMIT 1
) AS runner_with_fastest_lap
FROM Races;
/*------+-------------------------*
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer |
*------+-------------------------*/
请注意,前面的查询使用英文逗号运算符 (,
) 来执行交叉联接并展平数组。这相当于使用显式 CROSS JOIN
,或使用以下使用显式 INNER JOIN
的示例:
WITH
Races AS (
SELECT "800M" AS race,
[
STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps)
] AS participants
)
SELECT
race,
(
SELECT name
FROM UNNEST(participants)
INNER JOIN UNNEST(laps) AS duration
ORDER BY duration ASC LIMIT 1
) AS runner_with_fastest_lap
FROM Races;
/*------+-------------------------*
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer |
*------+-------------------------*/
使用 INNER JOIN
展平数组会排除具有空数组或 NULL 数组的行。如果要包含这些行,请使用 LEFT JOIN
。
WITH
Races AS (
SELECT
"800M" AS race,
[
STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS laps),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS laps),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS laps),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS laps),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS laps),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS laps),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS laps),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] AS laps),
STRUCT("Nathan" AS name, ARRAY<FLOAT64>[] AS laps),
STRUCT("David" AS name, NULL AS laps)
] AS participants
)
SELECT
Participant.name,
SUM(duration) AS finish_time
FROM Races
INNER JOIN Races.participants AS Participant
LEFT JOIN Participant.laps AS duration
GROUP BY name;
/*-------------+--------------------*
| name | finish_time |
+-------------+--------------------+
| Murphy | 102.9 |
| Rudisha | 102.19999999999999 |
| David | NULL |
| Rotich | 103.6 |
| Makhloufi | 102.6 |
| Berian | 106.1 |
| Bosse | 103.4 |
| Kipketer | 106 |
| Nathan | NULL |
| Lewandowski | 104.2 |
*-------------+--------------------*/
构建数组
您可以使用数组字面量或数组函数构建 ARRAY。如需详细了解如何构建 ARRAY,请参阅 ARRAY 类型。
基于子查询创建数组
使用数组时,一个常见的任务是将子查询结果转换为数组。在 GoogleSQL 中,您可以使用 ARRAY()
函数来实现此目的。
例如,对 Sequences
表执行以下操作:
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
ARRAY(SELECT x * 2
FROM UNNEST(some_numbers) AS x) AS doubled
FROM Sequences;
/*--------------------+---------------------*
| some_numbers | doubled |
+--------------------+---------------------+
| [0, 1, 1, 2, 3, 5] | [0, 2, 2, 4, 6, 10] |
| [2, 4, 8, 16, 32] | [4, 8, 16, 32, 64] |
| [5, 10] | [10, 20] |
*--------------------+---------------------*/
此示例以一个名为 Sequences 的表开始。该表包含一个类型为 ARRAY<INT64>
的列 some_numbers
。
该查询本身包含一个子查询。该子查询会选择 some_numbers
列中的每一行,并使用 UNNEST
将数组返回为一组行。接下来,它将每个值乘以 2,然后使用 ARRAY()
运算符将这些行重新组合到一个数组中。
过滤数组
以下示例在 ARRAY()
运算符的子查询中使用 WHERE
子句,以过滤返回的行。
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
ARRAY(SELECT x * 2
FROM UNNEST(some_numbers) AS x
WHERE x < 5) AS doubled_less_than_five
FROM Sequences;
/*------------------------*
| doubled_less_than_five |
+------------------------+
| [0, 2, 2, 4, 6] |
| [4, 8] |
| [] |
*------------------------*/
请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]
) 中的元素不符合 x < 5
的过滤条件。
您也可以使用 SELECT DISTINCT
来过滤数组,这样可以仅返回数组中的唯一元素。
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers)
SELECT ARRAY(SELECT DISTINCT x
FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM Sequences;
/*-----------------*
| unique_numbers |
+-----------------+
| [0, 1, 2, 3, 5] |
*-----------------*/
您还可以使用 IN
关键字过滤数组行。此关键字可过滤包含数组的行,方法是确定特定值是否与数组中的元素匹配。
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
ARRAY(SELECT x
FROM UNNEST(some_numbers) AS x
WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM Sequences;
/*--------------------*
| contains_two |
+--------------------+
| [0, 1, 1, 2, 3, 5] |
| [2, 4, 8, 16, 32] |
| [] |
*--------------------*/
另请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]
) 中的数组不包含 2
。
扫描数组
如需检查数组是否包含特定值,请结合使用 IN
运算符和 UNNEST
。如需检查数组是否包含与某条件匹配的值,请结合使用 EXISTS
运算符和 UNNEST
。
扫描特定值
如需扫描数组中的特定值,请将 IN
运算符与 UNNEST
结合使用。
示例
以下示例在数组中包含数字 2 时返回 true
。
SELECT 2 IN UNNEST([0, 1, 1, 2, 3, 5]) AS contains_value;
/*----------------*
| contains_value |
+----------------+
| true |
*----------------*/
对于一个表,要返回数组列中包含特定值的行,请使用 WHERE
子句过滤 IN UNNEST
的结果。
示例
以下示例返回数组列中包含值 2 的行的 id
值。
WITH Sequences AS
(SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows
FROM Sequences
WHERE 2 IN UNNEST(Sequences.some_numbers)
ORDER BY matching_rows;
/*---------------*
| matching_rows |
+---------------+
| 1 |
| 2 |
*---------------*/
扫描满足条件的值
如需扫描数组以查找与条件相匹配的值,请使用 UNNEST
返回包含数组中元素的表,使用 WHERE
过滤子查询中的结果表,然后使用 EXISTS
检查过滤后的表是否包含任何行。
示例
以下示例返回包含大于 5 的值的数组列中行的 id
值。
WITH
Sequences AS (
SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL
SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL
SELECT 3 AS id, [5, 10] AS some_numbers
)
SELECT id AS matching_rows
FROM Sequences
WHERE EXISTS(SELECT * FROM UNNEST(some_numbers) AS x WHERE x > 5);
/*---------------*
| matching_rows |
+---------------+
| 2 |
| 3 |
*---------------*/
扫描满足条件的 STRUCT
字段值
如需搜索 STRUCT
数组中值与条件相匹配的字段,请使用 UNNEST
返回一个每列均与一个 STRUCT
字段相对应的表,然后使用 WHERE EXISTS
从表中过滤掉不匹配的行。
示例
以下示例返回数组列中包含 STRUCT
且其字段 b
的值大于 3 的行。
WITH
Sequences AS (
SELECT 1 AS id, [STRUCT(0 AS a, 1 AS b)] AS some_numbers
UNION ALL
SELECT 2 AS id, [STRUCT(2 AS a, 4 AS b)] AS some_numbers
UNION ALL
SELECT 3 AS id, [STRUCT(5 AS a, 3 AS b), STRUCT(7 AS a, 4 AS b)] AS some_numbers
)
SELECT id AS matching_rows
FROM Sequences
WHERE EXISTS(SELECT 1 FROM UNNEST(some_numbers) WHERE b > 3);
/*---------------*
| matching_rows |
+---------------+
| 2 |
| 3 |
*---------------*/
数组和聚合
借助 GoogleSQL,您可以使用 ARRAY_AGG()
将值聚合到数组中。
WITH Fruits AS
(SELECT "apple" AS fruit
UNION ALL SELECT "pear" AS fruit
UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM Fruits;
/*-----------------------*
| fruit_basket |
+-----------------------+
| [apple, pear, banana] |
*-----------------------*/
由 ARRAY_AGG()
返回的数组以任意顺序排列,这是因为该函数连接值的顺序无法保证。如需对数组元素进行排序,请使用 ORDER BY
。例如:
WITH Fruits AS
(SELECT "apple" AS fruit
UNION ALL SELECT "pear" AS fruit
UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit ORDER BY fruit) AS fruit_basket
FROM Fruits;
/*-----------------------*
| fruit_basket |
+-----------------------+
| [apple, banana, pear] |
*-----------------------*/
您还可以将聚合函数(如 SUM()
)应用于数组中的元素。例如,以下查询返回 Sequences
表中每一行的数组元素的总和。
WITH Sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
(SELECT SUM(x)
FROM UNNEST(s.some_numbers) AS x) AS sums
FROM Sequences AS s;
/*--------------------+------*
| some_numbers | sums |
+--------------------+------+
| [0, 1, 1, 2, 3, 5] | 12 |
| [2, 4, 8, 16, 32] | 62 |
| [5, 10] | 15 |
*--------------------+------*/
GoogleSQL 还支持聚合函数 ARRAY_CONCAT_AGG()
,该函数可以跨行连接数组列的元素。
WITH Aggregates AS
(SELECT [1,2] AS numbers
UNION ALL SELECT [3,4] AS numbers
UNION ALL SELECT [5, 6] AS numbers)
SELECT ARRAY_CONCAT_AGG(numbers) AS count_to_six_agg
FROM Aggregates;
/*--------------------------------------------------*
| count_to_six_agg |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6] |
*--------------------------------------------------*/
将数组转换为字符串
利用 ARRAY_TO_STRING()
函数,您可以将 ARRAY<STRING>
转换为单个 STRING
值,或者将 ARRAY<BYTES>
转换为单个 BYTES
值,其中,生成的值是有序连接的数组元素。
第二个参数是分隔符,相应函数将在输入之间插入该分隔符以生成输出;第二个参数必须与第一个参数的元素具有相同类型。
示例:
WITH Words AS
(SELECT ["Hello", "World"] AS greeting)
SELECT ARRAY_TO_STRING(greeting, " ") AS greetings
FROM Words;
/*-------------*
| greetings |
+-------------+
| Hello World |
*-------------*/
第三个参数(可选)取代输入数组中的 NULL
值。
如果您省略此参数,则该函数将忽略
NULL
数组元素。如果您提供空字符串,则该函数会为
NULL
数组元素插入分隔符。
示例:
SELECT
ARRAY_TO_STRING(arr, ".", "N") AS non_empty_string,
ARRAY_TO_STRING(arr, ".", "") AS empty_string,
ARRAY_TO_STRING(arr, ".") AS omitted
FROM (SELECT ["a", NULL, "b", NULL, "c", NULL] AS arr);
/*------------------+--------------+---------*
| non_empty_string | empty_string | omitted |
+------------------+--------------+---------+
| a.N.b.N.c.N | a..b..c. | a.b.c |
*------------------+--------------+---------*/
合并数组
在某些情况下,您可能需要将多个数组合并为单个数组。
您可以使用 ARRAY_CONCAT()
函数来完成此操作。
SELECT ARRAY_CONCAT([1, 2], [3, 4], [5, 6]) AS count_to_six;
/*--------------------------------------------------*
| count_to_six |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6] |
*--------------------------------------------------*/
更新数组
假设以下名为 arrays_table
的表。表格中的第一列是整数数组,而第二列包含两个嵌套的整数数组。
WITH arrays_table AS (
SELECT
[1, 2] AS regular_array,
STRUCT([10, 20] AS first_array, [100, 200] AS second_array) AS nested_arrays
UNION ALL SELECT
[3, 4] AS regular_array,
STRUCT([30, 40] AS first_array, [300, 400] AS second_array) AS nested_arrays
)
SELECT * FROM arrays_table;
/*---------------*---------------------------*----------------------------*
| regular_array | nested_arrays.first_array | nested_arrays.second_array |
+---------------+---------------------------+----------------------------+
| [1, 2] | [10, 20] | [100, 200] |
| [3, 4] | [30, 40] | [130, 400] |
*---------------*---------------------------*----------------------------*/
您可以使用 UPDATE
语句更新表中的数组。以下示例将数字 5 插入 regular_array
列中,并将 nested_arrays
列的 first_array
字段中的元素插入 second_array
字段中:
UPDATE
arrays_table
SET
regular_array = ARRAY_CONCAT(regular_array, [5]),
nested_arrays.second_array = ARRAY_CONCAT(nested_arrays.second_array,
nested_arrays.first_array)
WHERE TRUE;
SELECT * FROM arrays_table;
/*---------------*---------------------------*----------------------------*
| regular_array | nested_arrays.first_array | nested_arrays.second_array |
+---------------+---------------------------+----------------------------+
| [1, 2, 5] | [10, 20] | [100, 200, 10, 20] |
| [3, 4, 5] | [30, 40] | [130, 400, 30, 40] |
*---------------*---------------------------*----------------------------*/
压缩数组
假定两个大小相同的数组,您可以将其合并到一个数组中,该数组由输入数组中的元素对(取自其相应的位置)组成。此操作有时称为压缩。
您可以使用 UNNEST
和 WITH OFFSET
压缩数组。在此示例中,每个值对都以 STRUCT
形式存储在数组中。
WITH
Combinations AS (
SELECT
['a', 'b'] AS letters,
[1, 2, 3] AS numbers
)
SELECT
ARRAY(
SELECT AS STRUCT
letters[SAFE_OFFSET(index)] AS letter,
numbers[SAFE_OFFSET(index)] AS number
FROM Combinations
INNER JOIN
UNNEST(
GENERATE_ARRAY(
0,
LEAST(ARRAY_LENGTH(letters), ARRAY_LENGTH(numbers)) - 1)) AS index
ORDER BY index
) AS pairs;
/*------------------------------*
| pairs |
+------------------------------+
| [{ letter: "a", number: 1 }, |
| { letter: "b", number: 2 }] |
*------------------------------*/
您可以使用长度不同的输入数组,只要第一个数组的长度等于或小于第二个数组的长度。压缩后的数组将是最短输入数组的长度。
要获取一个包含所有元素的压缩数组(即使输入数组的长度不同),请将 LEAST
更改为 GREATEST
。如果一个数组中的元素没有另一个数组中的关联元素,则这些元素将与 NULL
配对。
WITH
Combinations AS (
SELECT
['a', 'b'] AS letters,
[1, 2, 3] AS numbers
)
SELECT
ARRAY(
SELECT AS STRUCT
letters[SAFE_OFFSET(index)] AS letter,
numbers[SAFE_OFFSET(index)] AS number
FROM Combinations
INNER JOIN
UNNEST(
GENERATE_ARRAY(
0,
GREATEST(ARRAY_LENGTH(letters), ARRAY_LENGTH(numbers)) - 1)) AS index
ORDER BY index
) AS pairs;
/*-------------------------------*
| pairs |
+-------------------------------+
| [{ letter: "a", number: 1 }, |
| { letter: "b", number: 2 }, |
| { letter: null, number: 3 }] |
*-------------------------------*/
构建数组的数组
GoogleSQL 不支持直接构建数组的数组。正确做法是,您必须创建一个结构体数组,每个结构体包含一个 ARRAY
类型的字段。为便于说明,请参考以下 Points
表:
/*----------*
| point |
+----------+
| [1, 5] |
| [2, 8] |
| [3, 7] |
| [4, 1] |
| [5, 7] |
*----------*/
现在,假设您要创建一个由 Points
表中的每个 point
组成的数组。如需完成此操作,请将各行返回的数组封装在一个 STRUCT
中,如下所示。
WITH Points AS
(SELECT [1, 5] AS point
UNION ALL SELECT [2, 8] AS point
UNION ALL SELECT [3, 7] AS point
UNION ALL SELECT [4, 1] AS point
UNION ALL SELECT [5, 7] AS point)
SELECT ARRAY(
SELECT STRUCT(point)
FROM Points)
AS coordinates;
/*-------------------*
| coordinates |
+-------------------+
| [{point: [1,5]}, |
| {point: [2,8]}, |
| {point: [5,7]}, |
| {point: [3,7]}, |
| {point: [4,1]}] |
*-------------------*/