本文档介绍如何向 Dataform 核心 SQLX 文件添加表及其列和记录的说明。
您可以向 Dataform 中的所有表类型(表、增量表和视图)添加表、列和记录说明。
您可能需要记录以下内容:
- 表格的用途。
- 表中列或记录的内容或角色。
- SQL 工作流的表与其他对象(例如依赖于当前表的表或视图)的关系。
- 应用于表的断言。
- 应用于表的前后操作或操作后操作。
- 表的所有者,即创建该表的用户。如果有多个团队成员处理同一个工作流,这样做可能很有用。
准备工作
在开始之前,请先创建表格。
所需的角色
如需获取记录表所需的权限,请让管理员向您授予工作区的 Dataform Editor (roles/dataform.editor
) IAM 角色。如需详细了解如何授予角色,请参阅管理访问权限。
添加表说明
如需为 SQLX 文件中的表添加说明,请按以下步骤操作:
在 Cloud 控制台中,转到 Dataform 页面。
选择一个代码库。
选择开发工作区。
在 Files 窗格中,点击要修改的表定义 SQLX 文件。
在文件的
config
块中,按以下格式输入表说明:description: "Description of the table",
可选:点击格式。
以下代码示例展示了添加到 SQLX 表定义文件的 config
块中的表说明:
config {
type: "table",
description: "Description of the table",
}
添加列和记录说明
如需将各个列和记录的说明添加到 SQLX 文件,请按以下步骤操作:
- 在表定义文件的
config
代码块中,输入columns: {}
。 在
columns: {}
中,按以下格式输入列说明:column_name: "Description of the column",
在
columns: {}
中,按以下格式输入记录说明:record_name: { description: "Description of the record", columns: { record_column_name: "Description of the record column" } }
可选:点击格式。
以下代码示例显示了 SQLX 表定义文件的 config
块中的表、列和记录说明:
config {
type: "table",
description: "Description of the table.",
columns: {
column1_name: "Description of the first column",
column2_name: "Description of the second column",
column3_name: "Description of the third column",
record_name: {
description: "Description of the record.",
columns: {
record_column1_name: "Description of the first record column",
record_column2_name: "Description of the second record column",
}
}
}
}
SELECT
"first_column_value" AS column_1_name,
"second_column_value" AS column_2_name,
"third_column_value" AS column_3_name,
STRUCT("first" AS record_column1_name,
"second" AS record_column2_name) AS record_name
通过包含内容在 Dataform 中重复使用列文档
您可以通过 JavaScript 包含,在整个 SQL 工作流中重复使用列的说明。如果您在 SQL 工作流中有多个名称和说明相同的列,则可能需要重复使用列文档。
- 如需创建可重复使用的列说明,请定义一个 JavaScript include 常量,并添加列名称及其说明。
您可以使用对单个列的说明来定义常量,也可以定义包含集或列说明的常量,以重复使用表中所有列的说明。如需详细了解如何在 Dataform 中创建和使用 include ,请参阅在 Dataform 中通过 include 重复使用变量和函数。
以下代码示例展示了多个常量,并带有对 includes/docs.js
JavaScript 文件中定义的各个列的说明:
// filename is includes/docs.js
const user_id = `A unique identifier for a user`;
const age = `The age of a user`;
const creation_date = `The date this user signed up`;
const user_tenure = `The number of years since the user's creation date`;
const badge_count = `The all-time number of badges the user has received`;
const questions_and_answer_count = `The all-time number of questions and answers the user has created`;
const question_count = `The all-time number of questions the user has created`;
const answer_count = `The all-time number of answers the user has created`;
const last_badge_received_at = `The time the user received their most recent badge`;
const last_posted_at = `The time the user last posted a question or answer`;
const last_question_posted_at = `The time the user last posted an answer`;
const last_answer_posted_at = `The time the user last posted a question`;
module.exports = {
user_id,
age,
creation_date,
user_tenure,
badge_count,
questions_and_answer_count,
question_count,
answer_count,
last_badge_received_at,
last_posted_at,
last_question_posted_at,
last_answer_posted_at,
};
以下代码示例显示了 includes/docs.js
中定义的 user_id
和 age
常量,它们在 definitions/my_table.sqlx
SQLX 表定义文件中用于为表中的选定列生成文档:
config {
type: "table",
description: "Table description.",
columns: {
user_id: docs.user_id,
column2_name: "Description of the second column",
column3_name: "Description of the third column",
age: docs.age,
}
}
SELECT ...
以下代码示例展示了一个常量,该常量包含 includes/docs.js
JavaScript 文件中定义的一组列说明:
// filename is includes/docs.js
const columns = {
user_id = `A unique identifier for a user`,
age = `The age of a user`,
creation_date = `The date this user signed up`,
user_tenure = `The number of years since the user's creation date`,
badge_count = `The all-time number of badges the user has received`,
questions_and_answer_count = `The all-time number of questions and answers the user has created`,
question_count = `The all-time number of questions the user has created`,
answer_count = `The all-time number of answers the user has created`,
last_badge_received_at = `The time the user received their most recent badge`,
last_posted_at = `The time the user last posted a question or answer`,
last_question_posted_at = `The time the user last posted an answer`,
last_answer_posted_at = `The time the user last posted a question`,
}
module.exports = {
columns
};
以下代码示例显示了 includes/table_docs.js
中定义的 columns
常量,该常量在 definitions/my_table.sqlx
SQLX 表定义文件中用于为表中的所有列生成文档:
config { type: "table",
description: "My table description",
columns: docs.columns
}
SELECT 1 AS one
后续步骤
- 如需了解如何在 Dataform 中创建和使用包含项,请参阅在 Dataform 中通过包含项重复使用变量和函数。
- 如需了解如何配置表分区和集群,请参阅创建表分区和集群。
- 如需了解如何使用断言测试表数据,请参阅使用断言测试表。