Manage data preparations
This document describes how to manage data preparations in BigQuery, including how to deploy and schedule data preparations. Data preparations are BigQuery resources powered by Dataform.
Before you begin
- Ensure you have enabled the Gemini in BigQuery API.
Required roles
To ensure that the Dataform service account has the necessary permissions to prepare data in BigQuery, see the required roles for Dataform service accounts.
To get the permissions that you need to prepare data in BigQuery, ask your administrator to grant you the following IAM roles on the project:
- BigQuery Data Editor (roles/bigquery.dataEditor)
- Service Usage Consumer (roles/serviceusage.serviceUsageConsumer)
For more information about granting roles, see Manage access to projects, folders, and organizations.
You might also be able to get the required permissions through custom roles or other predefined roles.
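As an illustration of what the two role grants amount to, the following Python sketch adds them to a project IAM policy held in memory. It assumes the standard IAM policy JSON shape (a "bindings" list of role and members entries). The helper name add_role_bindings and the example member addresses are hypothetical, and the sketch doesn't call any Google Cloud API; an administrator would still apply the resulting policy through their usual tooling.

```python
# Roles named in this document for preparing data in BigQuery.
REQUIRED_ROLES = [
    "roles/bigquery.dataEditor",
    "roles/serviceusage.serviceUsageConsumer",
]


def add_role_bindings(policy, member, roles=REQUIRED_ROLES):
    """Return a copy of `policy` in which `member` holds every role in `roles`.

    Hypothetical helper: it only edits a dictionary and leaves the input
    policy unchanged.
    """
    # Copy each binding (and its member list) so the original is not mutated.
    bindings = [
        dict(b, members=list(b.get("members", [])))
        for b in policy.get("bindings", [])
    ]
    for role in roles:
        # Reuse an unconditional binding for this role if one exists.
        binding = next(
            (b for b in bindings if b["role"] == role and "condition" not in b),
            None,
        )
        if binding is None:
            binding = {"role": role, "members": []}
            bindings.append(binding)
        if member not in binding["members"]:
            binding["members"].append(member)
    updated = dict(policy)
    updated["bindings"] = bindings
    return updated


# Example policy and member are placeholders.
policy = {
    "bindings": [
        {"role": "roles/viewer", "members": ["user:someone@example.com"]}
    ]
}
updated = add_role_bindings(policy, "user:analyst@example.com")
for binding in updated["bindings"]:
    print(binding["role"], binding["members"])
```

Calling the helper a second time with the same member leaves the policy unchanged, so it is safe to rerun.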
View existing data preparations
To view a list of existing data preparations, follow these steps:
- On the BigQuery Studio page, go to the Explorer pane.
- Expand your project.
- Expand the Data preparations list.
Schedule data preparations
You can create schedules in the data preparation editor and manage them on the BigQuery Orchestration page.
Create a schedule
To create a schedule that executes the data preparation steps and loads the prepared data into the destination table, schedule a one-time or a recurring data preparation run:
- From the data preparation toolbar, click Schedule.
- Enter a schedule name.
- Enter the service account name associated with the execution.
- Schedule a frequency.
- Click Create schedule.
View schedules
To view all data preparation schedules in your project, follow these steps:
- In the Google Cloud console, go to the Orchestration page.
- Optional: To view details of a selected schedule and its past runs, click the name of the schedule.
Delete a schedule
To permanently delete a schedule for a selected data preparation, follow these steps:
- In the Google Cloud console, go to the Orchestration page.
- In the row that contains the schedule, click Actions > Delete.
Optimize data preparation by incrementally processing data
To configure how your prepared data is written into a destination table, follow these steps:
- In the Google Cloud console, go to the BigQuery Studio page.
- In the Activity pane, select your data preparation.
- In the toolbar of your data preparation, select More > Write mode.
- Select one of the options. For more information, see Write mode.
- Click Save.
Help improve suggestions
You can help improve Gemini suggestions by sharing with Google the prompt data that you submit to features in Preview. To share your prompt data, follow these steps:
- Open the data preparation editor in BigQuery.
- In the data preparation toolbar, click More.
- Select Share data to improve Gemini in BigQuery.
Data sharing settings apply to the entire project and can only be set by a project administrator with the serviceusage.services.enable and serviceusage.services.list IAM permissions. For more information about data use in the Trusted Tester Program, see Gemini for Google Cloud Trusted Tester Program.
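To make the permission requirement concrete, the following Python sketch reports which of the two permissions a principal still lacks. The function name missing_permissions is hypothetical, and the list of granted permissions is assumed to come from elsewhere, for example from the response to a Resource Manager testIamPermissions request; the sketch itself makes no API calls.

```python
# Permissions that this document says are needed to change data sharing settings.
REQUIRED_PERMISSIONS = frozenset(
    {
        "serviceusage.services.enable",
        "serviceusage.services.list",
    }
)


def missing_permissions(granted):
    """Return the required permissions that are absent from `granted`, sorted."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Example input is a placeholder for permissions reported for a principal.
print(missing_permissions(["serviceusage.services.list"]))
# prints ['serviceusage.services.enable']
```

An empty result means the principal holds both permissions.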
Data preparation versions
BigQuery data preparations don't support viewing, comparing, or restoring data preparation versions.
For a list of data preparation versions in chronological order, follow these steps:
- On the BigQuery Studio page, go to the Explorer pane.
- Select your data preparation. Versions are listed on the Activity tab in the Explorer pane.
Download a data preparation
To download a data preparation in a YAML file, follow these steps:
- In the Google Cloud console, go to the BigQuery Studio page.
- In the Explorer pane, expand your project and the Data preparations folder. Click the name of the data preparation that you want to download.
- Click Download. The data preparation is saved in the YAML file format, for example, NAME data preparation.dp.yaml.
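If you keep downloaded data preparations alongside other YAML files, a small check on the file name can help you find them. The following Python sketch assumes that downloaded files keep the .dp.yaml suffix shown in the example above; the helper names and the sample file names are hypothetical, and the sketch doesn't inspect the file contents.

```python
from pathlib import Path

# Suffix taken from the example file name in this document.
DP_SUFFIX = ".dp.yaml"


def is_data_preparation_file(path):
    """Return True if the file name ends with the .dp.yaml suffix."""
    return Path(path).name.lower().endswith(DP_SUFFIX)


def preparation_stem(path):
    """Return the file name without the .dp.yaml suffix."""
    name = Path(path).name
    if not is_data_preparation_file(name):
        raise ValueError(f"not a data preparation file: {name}")
    return name[: -len(DP_SUFFIX)]


# Sample file names are placeholders.
for candidate in ["sales data preparation.dp.yaml", "notes.yaml"]:
    print(candidate, is_data_preparation_file(candidate))
```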
Upload a data preparation
To upload a data preparation from a YAML file, follow these steps:
- In the Google Cloud console, go to the BigQuery Studio page.
- In the Explorer pane, expand your project.
- Go to the Data preparations folder and click Menu > Upload to Data preparation.
- In the Upload data preparation dialog, select a file to upload, or enter the URL of the data preparation.
- Enter a name for the data preparation.
- Select a data preparation location where resources are managed and stored.
- Click Upload.
What's next
- Learn more about preparing data in BigQuery.
- Learn how to create data preparations.