在 Cloud Spanner SQL 中,数组是由零个或多个数据类型相同的值组成的有序列表。您可以构建简单数据类型(如 INT64
)的数组和复杂数据类型(如 STRUCT
)的数组。目前,ARRAY
数据类型除外:不支持数组的数组。数组可以包含 NULL
值。
借助 Cloud Spanner SQL,您可以构建数组字面量,使用 ARRAY
函数基于子查询构建数组,还可以使用 ARRAY_AGG
函数将值聚合到数组中。
您可以使用 ARRAY_CONCAT()
等函数合并数组,也可以使用 ARRAY_TO_STRING()
将数组转换为字符串。
构建数组
使用数组字面量
您可以使用英文方括号([
和 ]
)在 Cloud Spanner SQL 中构建数组字面量。数组中的每个元素都用英文逗号分隔。
SELECT [1, 2, 3] as numbers;
SELECT ["apple", "pear", "orange"] as fruit;
SELECT [true, false, true] as booleans;
您也可以基于任何具有兼容类型的表达式创建数组。例如:
SELECT [a, b, c]
FROM
(SELECT 5 AS a,
37 AS b,
406 AS c);
SELECT [a, b, c]
FROM
(SELECT CAST(5 AS INT64) AS a,
CAST(37 AS FLOAT64) AS b,
406 AS c);
注意,第二个示例包含三个表达式:第一个返回 INT64
,第二个返回 FLOAT64
,第三个声明一个字面量。此表达式之所以有效,是因为所有这三个表达式都具有 FLOAT64
这一超类型。
如需为数组声明特定数据类型,请使用英文尖括号(<
和 >
)。例如:
SELECT ARRAY<FLOAT64>[1, 2, 3] as floats;
大多数数据类型(例如 INT64
或 STRING
)的数组不需要事先进行声明。
SELECT [1, 2, 3] as numbers;
您可以使用 ARRAY<type>[]
编写一个特定类型的空数组。您还可以使用 []
编写一个无类型的空数组,在这种情况下,Cloud Spanner SQL 会尝试通过上下文环境推断数组类型。如果 Cloud Spanner SQL 无法推断类型,则会使用默认类型 ARRAY<INT64>
。
使用生成的值
您还可以使用生成的值构造 ARRAY
。
生成整数数组
GENERATE_ARRAY
基于起始值和结束值以及步进值生成值数组。例如,以下查询用于生成一个包含从 11 到 33(含边界值)的所有奇整数的数组:
SELECT GENERATE_ARRAY(11, 33, 2) AS odds;
+--------------------------------------------------+
| odds |
+--------------------------------------------------+
| [11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33] |
+--------------------------------------------------+
您还可以通过提供负步进值按照降序生成值数组:
SELECT GENERATE_ARRAY(21, 14, -1) AS countdown;
+----------------------------------+
| countdown |
+----------------------------------+
| [21, 20, 19, 18, 17, 16, 15, 14] |
+----------------------------------+
生成日期数组
GENERATE_DATE_ARRAY
基于起始和结束 DATE
以及步进 INTERVAL
生成 DATE
数组。
您可以使用 GENERATE_DATE_ARRAY
生成一组 DATE
值。例如,以下查询以 1 WEEK
的间隔返回当前 DATE
和之后的若干 DATE
,直到以后的某一 DATE
(包含该日期):
SELECT
GENERATE_DATE_ARRAY('2017-11-21', '2017-12-31', INTERVAL 1 WEEK)
AS date_array;
+--------------------------------------------------------------------------+
| date_array |
+--------------------------------------------------------------------------+
| [2017-11-21, 2017-11-28, 2017-12-05, 2017-12-12, 2017-12-19, 2017-12-26] |
+--------------------------------------------------------------------------+
访问数组元素
请参考下表 sequences
:
CREATE TABLE sequences (
id INT64 NOT NULL,
some_numbers ARRAY<INT64> NOT NULL
) PRIMARY KEY(id);
假设表中填充了以下行:
+----+---------------------+
| id | some_numbers |
+----+---------------------+
| 1 | [0, 1, 1, 2, 3, 5] |
| 2 | [2, 4, 8, 16, 32] |
| 3 | [5, 10] |
+----+---------------------+
该表包含 ARRAY
数据类型的列 some_numbers
。如需访问此列中数组的元素,您必须指定要使用的索引类型:OFFSET
(从 0 开始的索引)或 ORDINAL
(从 1 开始的索引)。
WITH sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
some_numbers[OFFSET(1)] AS offset_1,
some_numbers[ORDINAL(1)] AS ordinal_1
FROM sequences;
+--------------------+----------+-----------+
| some_numbers | offset_1 | ordinal_1 |
+--------------------+----------+-----------+
| [0, 1, 1, 2, 3, 5] | 1 | 0 |
| [2, 4, 8, 16, 32] | 4 | 2 |
| [5, 10] | 10 | 5 |
+--------------------+----------+-----------+
以下查询展示了如何使用 OFFSET()
和 ORDINAL()
:
SELECT some_numbers,
some_numbers[OFFSET(1)] AS offset_1,
some_numbers[ORDINAL(1)] AS ordinal_1
FROM sequences;
+---------------+----------+-----------+
| some_numbers | offset_1 | ordinal_1 |
+---------------+----------+-----------+
| [0,1,1,2,3,5] | 1 | 0 |
+---------------+----------+-----------+
| [2,4,8,16,32] | 4 | 2 |
+---------------+----------+-----------+
| [5,10] | 10 | 5 |
+---------------+----------+-----------+
查找长度
ARRAY_LENGTH()
函数用于返回数组的长度。
WITH sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
ARRAY_LENGTH(some_numbers) AS len
FROM sequences;
+--------------------+--------+
| some_numbers | len |
+--------------------+--------+
| [0, 1, 1, 2, 3, 5] | 6 |
| [2, 4, 8, 16, 32] | 5 |
| [5, 10] | 2 |
+--------------------+--------+
下面是一个查询示例,该示例假设 sequences
表的定义与上文相同,并且具有相同的样本行:
SELECT some_numbers,
ARRAY_LENGTH(some_numbers) AS len
FROM sequences;
+---------------+------+
| some_numbers | len |
+---------------+------+
| [0,1,1,2,3,5] | 6 |
+---------------+------+
| [2,4,8,16,32] | 5 |
+---------------+------+
| [5,10] | 2 |
+---------------+------+
将数组中的元素转换为表中的行
如需将 ARRAY
转换为一组行(即执行展平操作),请使用 UNNEST
运算符。UNNEST
获取一个 ARRAY
,然后返回一个表,ARRAY
中的每个元素均占该表的一行。
由于 UNNEST
破坏了 ARRAY
元素的顺序,您可能需要恢复表中的顺序。为此,请使用可选的 WITH OFFSET
子句返回另一个包含各数组元素偏移量的列,然后使用 ORDER BY
子句按偏移量对行进行排序。
示例
SELECT *
FROM UNNEST(['foo', 'bar', 'baz', 'qux', 'corge', 'garply', 'waldo', 'fred'])
AS element
WITH OFFSET AS offset
ORDER BY offset;
+----------+--------+
| element | offset |
+----------+--------+
| foo | 0 |
| bar | 1 |
| baz | 2 |
| qux | 3 |
| corge | 4 |
| garply | 5 |
| waldo | 6 |
| fred | 7 |
+----------+--------+
如需展平一整列 ARRAY
,同时保留每行中其他列的值,请使用 CROSS JOIN
将带有 ARRAY
列的表联接到该 ARRAY
列的 UNNEST
输出。
这是一种相互关联的交叉联接:UNNEST
运算符引用了源表中每一行的 ARRAY
列,该列之前曾出现在 FROM
子句中。对于源表中的每一行 N
,UNNEST
将行 N
的 ARRAY
展平成一组包含 ARRAY
元素的行,然后 CROSS JOIN
将这组新行与源表的单行 N
联接起来。
示例
下述示例使用 UNNEST
为数组列中的每个元素返回一行。使用 CROSS JOIN
后,id
列显示 sequences
中包含每个数字的行的 id
值。
WITH sequences AS
(SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id, flattened_numbers
FROM sequences
CROSS JOIN UNNEST(sequences.some_numbers) AS flattened_numbers;
+------+-------------------+
| id | flattened_numbers |
+------+-------------------+
| 1 | 0 |
| 1 | 1 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 5 |
| 2 | 2 |
| 2 | 4 |
| 2 | 8 |
| 2 | 16 |
| 2 | 32 |
| 3 | 5 |
| 3 | 10 |
+------+-------------------+
请注意,对于相关交叉联接,UNNEST
运算符为可选运算符,CROSS JOIN
可表示为逗号联接。上述示例使用此简写表示法后变为:
WITH sequences AS
(SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id, flattened_numbers
FROM sequences, sequences.some_numbers AS flattened_numbers;
+------+-------------------+
| id | flattened_numbers |
+------+-------------------+
| 1 | 0 |
| 1 | 1 |
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 5 |
| 2 | 2 |
| 2 | 4 |
| 2 | 8 |
| 2 | 16 |
| 2 | 32 |
| 3 | 5 |
| 3 | 10 |
+------+-------------------+
查询嵌套数组
如果一个表包含 STRUCT
的 ARRAY
,则可展平 ARRAY
以查询 STRUCT
的字段。您也可以展平 STRUCT
值的 ARRAY
类型字段。
查询 ARRAY 中的 STRUCT 元素
以下示例将 UNNEST
与 CROSS JOIN
结合使用,以展平 STRUCT
的 ARRAY
。
SELECT race,
participant.name,
participant.splits
FROM
(SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
AS participants
) AS r
CROSS JOIN UNNEST(r.participants) AS participant;
+------+-------------+-----------------------+
| race | name | splits |
+------+-------------+-----------------------+
| 800M | Rudisha | [23.4,26.3,26.4,26.1] |
+------+-------------+-----------------------+
| 800M | Makhloufi | [24.5,25.4,26.6,26.1] |
+------+-------------+-----------------------+
| 800M | Murphy | [23.9,26,27,26] |
+------+-------------+-----------------------+
| 800M | Bosse | [23.6,26.2,26.5,27.1] |
+------+-------------+-----------------------+
| 800M | Rotich | [24.7,25.6,26.9,26.4] |
+------+-------------+-----------------------+
| 800M | Lewandowski | [25,25.7,26.3,27.2] |
+------+-------------+-----------------------+
| 800M | Kipketer | [23.2,26.1,27.3,29.4] |
+------+-------------+-----------------------+
| 800M | Berian | [23.7,26.1,27,29.3] |
+------+-------------+-----------------------+
您可以从重复字段中找到特定信息。例如,以下查询用于返回 800 米赛跑中跑得最快的参赛选手。
此示例并不涉及数组展平操作,但展示了从重复字段中获取信息的一种常用方法。
示例
WITH races AS (
SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
AS participants)
SELECT
race,
(SELECT name
FROM UNNEST(participants)
ORDER BY (
SELECT SUM(duration)
FROM UNNEST(splits) AS duration) ASC
LIMIT 1) AS fastest_racer
FROM races;
+------+---------------+
| race | fastest_racer |
+------+---------------+
| 800M | Rudisha |
+------+---------------+
SELECT race,
(SELECT name
FROM UNNEST(participants)
ORDER BY (
SELECT SUM(duration)
FROM UNNEST(splits) AS duration) ASC
LIMIT 1) AS fastest_racer
FROM
(SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
AS participants
) AS r;
+------+---------------+
| race | fastest_racer |
+------+---------------+
| 800M | Rudisha |
+------+---------------+
查询 STRUCT 中的 ARRAY 类型字段
您也可以从嵌套的重复字段中获取信息。例如,以下语句用于返回在 800 米赛跑中圈速最快的参赛选手。
WITH races AS (
SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
AS participants)
SELECT
race,
(SELECT name
FROM UNNEST(participants),
UNNEST(splits) AS duration
ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM races;
+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer |
+------+-------------------------+
SELECT race,
(SELECT name
FROM UNNEST(participants),
UNNEST(splits) AS duration
ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM
(SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
AS participants
) AS r;
+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer |
+------+-------------------------+
请注意,前面的查询使用英文逗号运算符 (,
) 来执行隐式 CROSS JOIN
。这一做法的效果与以下示例相同,后者使用了显式 CROSS JOIN
。
WITH races AS (
SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits)]
AS participants)
SELECT
race,
(SELECT name
FROM UNNEST(participants)
CROSS JOIN UNNEST(splits) AS duration
ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM races;
+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer |
+------+-------------------------+
SELECT race,
(SELECT name
FROM UNNEST(participants)
CROSS JOIN UNNEST(splits) AS duration
ORDER BY duration ASC LIMIT 1) AS runner_with_fastest_lap
FROM
(SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] AS splits),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
STRUCT("Berian" AS name, [23.7, 26.1, 27.0, 29.3] as splits)]
AS participants
) AS r;
+------+-------------------------+
| race | runner_with_fastest_lap |
+------+-------------------------+
| 800M | Kipketer |
+------+-------------------------+
使用 CROSS JOIN
展平数组会排除具有空数组或 NULL 数组的行。如果要包含这些行,请使用 LEFT JOIN
。
WITH races AS (
SELECT "800M" AS race,
[STRUCT("Rudisha" as name, [23.4, 26.3, 26.4, 26.1] as splits),
STRUCT("Makhloufi" as name, [24.5, 25.4, 26.6, 26.1] as splits),
STRUCT("Murphy" as name, [23.9, 26.0, 27.0, 26.0] as splits),
STRUCT("Bosse" as name, [23.6, 26.2, 26.5, 27.1] as splits),
STRUCT("Rotich" as name, [24.7, 25.6, 26.9, 26.4] as splits),
STRUCT("Lewandowski" as name, [25.0, 25.7, 26.3, 27.2] as splits),
STRUCT("Kipketer" as name, [23.2, 26.1, 27.3, 29.4] as splits),
STRUCT("Berian" as name, [23.7, 26.1, 27.0, 29.3] as splits),
STRUCT("Nathan" as name, ARRAY<FLOAT64>[] as splits),
STRUCT("David" as name, NULL as splits)]
AS participants)
SELECT
name, sum(duration) AS finish_time
FROM races, races.participants LEFT JOIN participants.splits duration
GROUP BY name;
+-------------+--------------------+
| name | finish_time |
+-------------+--------------------+
| Murphy | 102.9 |
| Rudisha | 102.19999999999999 |
| David | NULL |
| Rotich | 103.6 |
| Makhloufi | 102.6 |
| Berian | 106.1 |
| Bosse | 103.4 |
| Kipketer | 106 |
| Nathan | NULL |
| Lewandowski | 104.2 |
+-------------+--------------------+
SELECT
name, sum(duration) as duration
FROM
(SELECT "800M" AS race,
[STRUCT("Rudisha" AS name, [23.4, 26.3, 26.4, 26.1] AS splits),
STRUCT("Makhloufi" AS name, [24.5, 25.4, 26.6, 26.1] AS splits),
STRUCT("Murphy" AS name, [23.9, 26.0, 27.0, 26.0] AS splits),
STRUCT("Bosse" AS name, [23.6, 26.2, 26.5, 27.1] AS splits),
STRUCT("Rotich" AS name, [24.7, 25.6, 26.9, 26.4] AS splits),
STRUCT("Lewandowski" AS name, [25.0, 25.7, 26.3, 27.2] AS splits),
STRUCT("Kipketer" AS name, [23.2, 26.1, 27.3, 29.4] AS splits),
STRUCT("Nathan" as name, ARRAY<FLOAT64>[] as splits),
STRUCT("David" as name, NULL as splits)]
AS participants) AS races,
races.participants LEFT JOIN participants.splits duration
GROUP BY name;
+-------------+--------------------+
| name | duration |
+-------------+--------------------+
| Murphy | 102.9 |
| Rudisha | 102.19999999999999 |
| David | NULL |
| Rotich | 103.6 |
| Makhloufi | 102.6 |
| Bosse | 103.4 |
| Kipketer | 106 |
| Nathan | NULL |
| Lewandowski | 104.2 |
+-------------+--------------------+
基于子查询创建数组
使用数组时,一个常见的任务是将子查询结果转换为数组。在 Cloud Spanner SQL 中,您可以使用 ARRAY()
函数来完成此操作。
例如,对 sequences
表执行以下操作:
WITH sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
ARRAY(SELECT x * 2
FROM UNNEST(some_numbers) AS x) AS doubled
FROM sequences;
+--------------------+---------------------+
| some_numbers | doubled |
+--------------------+---------------------+
| [0, 1, 1, 2, 3, 5] | [0, 2, 2, 4, 6, 10] |
| [2, 4, 8, 16, 32] | [4, 8, 16, 32, 64] |
| [5, 10] | [10, 20] |
+--------------------+---------------------+
SELECT some_numbers,
ARRAY(SELECT x * 2
FROM UNNEST(some_numbers) AS x) AS doubled
FROM sequences;
+---------------+----------------+
| some_numbers | doubled |
+---------------+----------------+
| [0,1,1,2,3,5] | [0,2,2,4,6,10] |
+---------------+----------------+
| [2,4,8,16,32] | [4,8,16,32,64] |
+---------------+----------------+
| [5,10] | [10,20] |
+---------------+----------------+
此示例以一个名为 sequences 的表开始。该表包含一个类型为 ARRAY<INT64>
的列 some_numbers
。
该查询本身包含一个子查询。该子查询会选择 some_numbers
列中的每一行,并使用 UNNEST
将数组返回为一组行。接下来,它将每个值乘以 2,然后使用 ARRAY()
运算符将这些行重新组合到一个数组中。
过滤数组
以下示例在 ARRAY()
运算符的子查询中使用 WHERE
子句,以过滤返回的行。
注意:在下列示例中,返回的结果行未进行排序。
WITH sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
ARRAY(SELECT x * 2
FROM UNNEST(some_numbers) AS x
WHERE x < 5) AS doubled_less_than_five
FROM sequences;
+------------------------+
| doubled_less_than_five |
+------------------------+
| [0, 2, 2, 4, 6] |
| [4, 8] |
| [] |
+------------------------+
SELECT
ARRAY(SELECT x * 2
FROM UNNEST(some_numbers) AS x
WHERE x < 5) AS doubled_less_than_five
FROM sequences;
+------------------------+
| doubled_less_than_five |
+------------------------+
| [0,2,2,4,6] |
+------------------------+
| [4,8] |
+------------------------+
| [] |
+------------------------+
请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]
) 中的元素不符合 x < 5
的过滤条件。
您也可以使用 SELECT DISTINCT
来过滤数组,这样可以仅返回数组中的唯一元素。
WITH sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers)
SELECT ARRAY(SELECT DISTINCT x
FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM sequences;
+-----------------+
| unique_numbers |
+-----------------+
| [0, 1, 2, 3, 5] |
+-----------------+
SELECT ARRAY(SELECT DISTINCT x
FROM UNNEST(some_numbers) AS x) AS unique_numbers
FROM sequences
WHERE id = 1;
+----------------+
| unique_numbers |
+----------------+
| [0,1,2,3,5] |
+----------------+
您还可以使用 IN
关键字过滤数组行。此关键字可过滤包含数组的行,方法是确定特定值是否与数组中的元素匹配。
WITH sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT
ARRAY(SELECT x
FROM UNNEST(some_numbers) AS x
WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM sequences;
+--------------------+
| contains_two |
+--------------------+
| [0, 1, 1, 2, 3, 5] |
| [2, 4, 8, 16, 32] |
| [] |
+--------------------+
SELECT
ARRAY(SELECT x
FROM UNNEST(some_numbers) AS x
WHERE 2 IN UNNEST(some_numbers)) AS contains_two
FROM sequences;
+---------------+
| contains_two |
+---------------+
| [0,1,1,2,3,5] |
+---------------+
| [2,4,8,16,32] |
+---------------+
| [] |
+---------------+
另请注意,第三行包含一个空数组,这是因为与之对应的原始行 ([5, 10]
) 中的数组不包含 2
。
扫描数组
如需检查数组是否包含特定值,请结合使用 IN
运算符和 UNNEST
。如需检查数组是否包含与某条件匹配的值,请结合使用 EXISTS
函数和 UNNEST
。
扫描特定值
如需扫描数组中的特定值,请将 IN
运算符与 UNNEST
结合使用。
示例
以下示例在数组中包含数字 2 时返回 true
。
SELECT 2 IN UNNEST([0, 1, 1, 2, 3, 5]) AS contains_value;
+----------------+
| contains_value |
+----------------+
| true |
+----------------+
对于一个表,要返回数组列中包含特定值的行,请使用 WHERE
子句过滤 IN UNNEST
的结果。
示例
以下示例返回数组列中包含值 2 的行的 id
值。
WITH sequences AS
(SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows
FROM sequences
WHERE 2 IN UNNEST(sequences.some_numbers)
ORDER BY matching_rows;
+---------------+
| matching_rows |
+---------------+
| 1 |
| 2 |
+---------------+
扫描满足条件的值
如需扫描数组以查找与条件相匹配的值,请使用 UNNEST
返回包含数组中元素的表,使用 WHERE
过滤子查询中的结果表,然后使用 EXISTS
检查过滤后的表是否包含任何行。
示例
以下示例返回包含大于 5 的值的数组列中行的 id
值。
WITH sequences AS
(SELECT 1 AS id, [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT 2 AS id, [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT 3 AS id, [5, 10] AS some_numbers)
SELECT id AS matching_rows FROM sequences
WHERE EXISTS (SELECT *
FROM UNNEST(some_numbers) AS x
WHERE x > 5);
+---------------+
| matching_rows |
+---------------+
| 2 |
| 3 |
+---------------+
扫描满足条件的 STRUCT 字段值
如需搜索 STRUCT
数组中值与条件相匹配的字段,请使用 UNNEST
返回一个每列均与一个 STRUCT
字段相对应的表,然后使用 WHERE EXISTS
从表中过滤掉不匹配的行。
示例
以下示例返回数组列中包含 STRUCT
且其字段 b
的值大于 3 的行。
WITH sequences AS
(SELECT 1 AS id, [STRUCT(0 AS a, 1 AS b)] AS some_numbers
UNION ALL SELECT 2 AS id, [STRUCT(2 AS a, 4 AS b)] AS some_numbers
UNION ALL SELECT 3 AS id, [STRUCT(5 AS a, 3 AS b), STRUCT (7 AS a, 4 AS b)]
AS some_numbers)
SELECT id AS matching_rows
FROM sequences
WHERE EXISTS (SELECT 1
FROM UNNEST(some_numbers)
WHERE b > 3);
+---------------+
| matching_rows |
+---------------+
| 2 |
| 3 |
+---------------+
数组和聚合
借助 Cloud Spanner SQL,您可以使用 ARRAY_AGG()
将值聚合到数组中。
WITH fruits AS
(SELECT "apple" AS fruit
UNION ALL SELECT "pear" AS fruit
UNION ALL SELECT "banana" AS fruit)
SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM fruits;
+-----------------------+
| fruit_basket |
+-----------------------+
| [apple, pear, banana] |
+-----------------------+
请参考下表 fruits
:
CREATE TABLE fruits (
fruit STRING(MAX),
id INT64 NOT NULL
) PRIMARY KEY(id);
假设表中填充了以下数据:
+----+--------------+
| id | fruit |
+----+--------------+
| 1 | "apple" |
| 2 | "pear" |
| 3 | "banana" |
+----+--------------+
以下查询展示了如何使用 ARRAY_AGG()
:
SELECT ARRAY_AGG(fruit) AS fruit_basket
FROM fruits;
+---------------------+
| fruit_basket |
+---------------------+
| [apple,pear,banana] |
+---------------------+
由 ARRAY_AGG()
返回的数组以任意顺序排列,这是因为该函数连接值的顺序无法保证。
您还可以将聚合函数(如 SUM()
)应用于数组中的元素。例如,以下查询返回 sequences
表中每一行的数组元素的总和。
WITH sequences AS
(SELECT [0, 1, 1, 2, 3, 5] AS some_numbers
UNION ALL SELECT [2, 4, 8, 16, 32] AS some_numbers
UNION ALL SELECT [5, 10] AS some_numbers)
SELECT some_numbers,
(SELECT SUM(x)
FROM UNNEST(s.some_numbers) x) AS sums
FROM sequences s;
+--------------------+------+
| some_numbers | sums |
+--------------------+------+
| [0, 1, 1, 2, 3, 5] | 12 |
| [2, 4, 8, 16, 32] | 62 |
| [5, 10] | 15 |
+--------------------+------+
SELECT some_numbers,
(SELECT SUM(x)
FROM UNNEST(s.some_numbers) x) AS sums
FROM sequences s;
+---------------+------+
| some_numbers | sums |
+---------------+------+
| [0,1,1,2,3,5] | 12 |
+---------------+------+
| [2,4,8,16,32] | 62 |
+---------------+------+
| [5,10] | 15 |
+---------------+------+
Cloud Spanner 还支持聚合函数 ARRAY_CONCAT_AGG()
,该函数可以跨行连接数组列的元素。
WITH aggregate_example AS
(SELECT [1,2] AS numbers
UNION ALL SELECT [3,4] AS numbers
UNION ALL SELECT [5, 6] AS numbers)
SELECT ARRAY_CONCAT_AGG(numbers) AS count_to_six_agg
FROM aggregate_example;
+--------------------------------------------------+
| count_to_six_agg |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6] |
+--------------------------------------------------+
注意:ARRAY_CONCAT_AGG()
连接值的顺序无法保证,因此该函数返回的数组是不确定的。
将数组转换为字符串
利用 ARRAY_TO_STRING()
函数,您可以将 ARRAY<STRING>
转换为单个 STRING
值,或者将 ARRAY<BYTES>
转换为单个 BYTES
值,其中,生成的值是有序连接的数组元素。
第二个参数是分隔符,相应函数将在输入之间插入该分隔符以生成输出;第二个参数必须与第一个参数的元素具有相同类型。
示例:
WITH greetings AS
(SELECT ["Hello", "World"] AS greeting)
SELECT ARRAY_TO_STRING(greeting, " ") AS greetings
FROM greetings;
+-------------+
| greetings |
+-------------+
| Hello World |
+-------------+
第三个参数(可选)取代输入数组中的 NULL
值。
如果您省略此参数,则该函数将忽略
NULL
数组元素。如果您提供空字符串,则该函数会为
NULL
数组元素插入分隔符。
示例:
SELECT
ARRAY_TO_STRING(arr, ".", "N") AS non_empty_string,
ARRAY_TO_STRING(arr, ".", "") AS empty_string,
ARRAY_TO_STRING(arr, ".") AS omitted
FROM (SELECT ["a", NULL, "b", NULL, "c", NULL] AS arr);
+------------------+--------------+---------+
| non_empty_string | empty_string | omitted |
+------------------+--------------+---------+
| a.N.b.N.c.N | a..b..c. | a.b.c |
+------------------+--------------+---------+
合并数组
在某些情况下,您可能需要将多个数组合并为单个数组。
您可以使用 ARRAY_CONCAT()
函数来完成此操作。
SELECT ARRAY_CONCAT([1, 2], [3, 4], [5, 6]) as count_to_six;
+--------------------------------------------------+
| count_to_six |
+--------------------------------------------------+
| [1, 2, 3, 4, 5, 6] |
+--------------------------------------------------+
构建数组的数组
Cloud Spanner SQL 不支持直接构建数组的数组。正确做法是,您必须创建一个结构体数组,每个结构体包含一个 ARRAY
类型的字段。为便于说明,请参考以下 points
表:
CREATE TABLE points (
point ARRAY<INT64>,
id INT64 NOT NULL
) PRIMARY KEY(id);
假设表中填充了以下行:
+----+----------+
| id | point |
+----+----------+
| 1 | [1, 5] |
| 2 | [2, 8] |
| 3 | [3, 7] |
| 4 | [4, 1] |
| 5 | [5, 7] |
+----+----------+
现在,假设您要创建一个由 points
表中的每个 point
组成的数组。如需完成此操作,请将各行返回的数组封装在一个 STRUCT
中,如下所示。
WITH points AS
(SELECT [1, 5] as point
UNION ALL SELECT [2, 8] as point
UNION ALL SELECT [3, 7] as point
UNION ALL SELECT [4, 1] as point
UNION ALL SELECT [5, 7] as point)
SELECT ARRAY(
SELECT STRUCT(point)
FROM points)
AS coordinates;
+-------------------+
| coordinates |
+-------------------+
| [{point: [1,5]}, |
| {point: [2,8]}, |
| {point: [5,7]}, |
| {point: [3,7]}, |
| {point: [4,1]}] |
+--------------------+
SELECT ARRAY(
SELECT STRUCT(point)
FROM points)
AS coordinates;
+--------------+
| coordinates |
+--------------+
| point: [1,5] |
| point: [2,8] |
| point: [3,7] |
| point: [4,1] |
| point: [5,7] |
+--------------+