View partition and cluster recommendations
This document describes how the partition and cluster recommender works, and how to view your recommendations and insights.
The BigQuery partitioning and clustering recommender generates partition or cluster recommendations to optimize your BigQuery tables. The recommender analyzes workflows on your BigQuery tables and offers recommendations to better optimize your workflows and query costs using either table partitioning or table clustering.
For more information about the Recommender service, see the Recommender overview.
How the recommender works
To generate a recommendation, the partitioning and clustering recommender uses the project's workload execution data from the past 30 days to analyze each BigQuery table for suboptimal partitioning and clustering configurations. The recommender also uses machine learning to predict how much the workload execution could be optimized with different partitioning or clustering configurations. If the recommender finds that partitioning or clustering a table yields significant savings, the recommender generates a recommendation. The partitioning and clustering recommender generates the following types of recommendations:
Existing table type | Recommendation subtype | Recommendation example |
---|---|---|
Non-partitioned, non-clustered | Partition | "Save about 64 slot hours per month by partitioning on column_C by DAY" |
Non-partitioned, non-clustered | Cluster | "Save about 64 slot hours per month by clustering on column_C" |
Partitioned, non-clustered | Cluster | "Save about 64 slot hours per month by clustering on column_C" |
Each recommendation consists of three parts:
- Guidance to either partition or cluster a specific table
- The specific column in a table to partition or cluster
- Estimated monthly savings for applying the recommendation
To calculate potential workload savings, the recommender assumes that the historical execution workload data from the past 30 days represents the future workload.
The recommender API also returns table workload information in the form of insights. Insights are findings that help you understand your project's workload, providing more context on how a partition or cluster recommendation might improve workload costs.
Limitations
The partitioning and clustering recommender excludes legacy SQL queries in its analysis.
The partitioning and clustering recommender supports resources that are stored in the following regions:
asia-northeast1
,asia-northeast2
,asia-northeast3
,asia-south1
,asia-south2
,asia-southeast1
,asia-southeast2
,asia-east1
,asia-east2
us
,us-west1
,us-west2
,us-west3
,us-west4
,us-central1
,us-central2
,us-east4
,us-east5
eu
,europe-west1
,europe-west4
,europe-west6
,europe-west8
,europe-west9
,europe-southwest1
southamerica-east1
,southamerica-west1
northamerica-northeast1
,northamerica-northeast2
Before you begin
Before you can view partition and cluster recommendations, you need to enable the Recommender API.
Required permissions
To get the permissions that you need to access partition and cluster recommendations,
ask your administrator to grant you the
BigQuery Partitioning Clustering Recommender Viewer (roles/recommender.bigqueryPartitionClusterViewer
) IAM role.
For more information about granting roles, see
Manage access.
This predefined role contains the permissions required to access partition and cluster recommendations. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to access partition and cluster recommendations:
-
recommender.bigqueryPartitionClusterRecommendations.get
-
recommender.bigqueryPartitionClusterRecommendations.list
You might also be able to get these permissions with custom roles or other predefined roles.
For more information about IAM roles and permissions in BigQuery, see Introduction to IAM.
View recommendations
This section describes how to view partition and cluster recommendations and insights using the Google Cloud console, the Google Cloud CLI or the Recommender API.
Select one of the following options:
Console
In the Google Cloud console, go to the BigQuery page.
To open the recommendations tab, click > View all recommendations.
RecommendationsThe recommendations tab lists all recommendations available to your project.
In the Optimize BigQuery workload cost panel, click View all.
The cost recommendation table lists all recommendations generated for the current project. For example, the following screenshot shows that the recommender analyzed the
example_table
table, and then recommended clustering theexample_column
column to save an approximate amount of bytes and slots.To see more information about the table insight and recommendation, click on a recommendation.
gcloud
To view partition or cluster recommendations for a specific project, use
the gcloud recommender recommendations list
command:
gcloud recommender recommendations list \ --project=PROJECT_NAME \ --location=REGION_NAME \ --recommender=google.bigquery.table.PartitionClusterRecommender \ --format=FORMAT_TYPE \
Replace the following:
PROJECT_NAME
: the name of the project that contains your BigQuery tableREGION_NAME
: the region that your project is inFORMAT_TYPE
: a supported gcloud CLI output format—for example, JSON
Property | Relevant for subtype | Description |
---|---|---|
recommenderSubtype |
Partition or cluster | Indicates the type of recommendation. |
content.overview.partitionColumn |
Partition | Recommended partitioning column name. |
content.overview.partitionTimeUnit |
Partition | Recommended partitioning time unit. For example, DAY means the
recommendation is to have daily partitions on the recommended column. |
content.overview.clusterColumns |
Cluster | Recommended clustering column names. |
- For more information about other fields in the recommender response, see REST Resource:
projects.locations.recommendersrecommendation
. - For more information about using the Recommender API, see Using the API - Recommendations.
To view table insights using the gcloud CLI, use the
gcloud recommender insights list
command:
gcloud recommender insights list \ --project=PROJECT_NAME \ --location=REGION_NAME \ --insight-type=google.bigquery.table.StatsInsight \ --format=FORMAT_TYPE \
Replace the following:
PROJECT_NAME
: the name of the project that contains your BigQuery tableREGION_NAME
: the region that your project is inFORMAT_TYPE
: a supported gcloud CLI output format—for example, JSON
Property | Relevant for subtype | Description |
---|---|---|
content.existingPartitionColumn |
Cluster | Existing partitioning column, if any |
content.tableSizeTb |
All | Size of the table in terabytes |
content.bytesReadMonthly |
All | Monthly bytes read from the table |
content.slotMsConsumedMonthly |
All | Monthly slot milliseconds consumed by the workload running on the table |
content.queryJobsCountMonthly |
All | Monthly count of jobs running on the table |
- For more information about other fields in the insights response, see REST Resource:
projects.locations.insightTypes.insights
. - For more information about using insights, see Using the API - Insights.
REST API
To view partition or cluster recommendations for a specific project, use the REST API. With each command, you must provide an authentication token, which you can get using the gcloud CLI. For more information about getting an authentication token, see Methods for getting an ID token.
You can use the curl list
request to view all recommendations for a
specific project:
curl -H "Authorization: Bearer $GCLOUD_AUTH_TOKEN" -H "x-goog-user-project: PROJECT_NAME" https://recommender.googleapis.com/v1/projects/my-project/locations/us/recommenders/google.bigquery.table.PartitionClusterRecommender/recommendations
Replace the following:
GCLOUD_AUTH_TOKEN
: the name of a valid gcloud CLI access tokenPROJECT_NAME
: the name of the project containing your BigQuery table
Property | Relevant for subtype | Description |
---|---|---|
recommenderSubtype |
Partition or cluster | Indicates the type of recommendation. |
content.overview.partitionColumn |
Partition | Recommended partitioning column name. |
content.overview.partitionTimeUnit |
Partition | Recommended partitioning time unit. For example, DAY means the
recommendation is to have daily partitions on the recommended column. |
content.overview.clusterColumns |
Cluster | Recommended clustering column names. |
- For more information about other fields in the recommender response, see REST Resource:
projects.locations.recommendersrecommendation
. - For more information about using the Recommender API, see Using the API - Recommendations.
To view table insights using the Google Cloud CLI, run the following command:
curl -H "Authorization: Bearer $GCLOUD_AUTH_TOKEN" -H "x-goog-user-project: PROJECT_NAME" https://recommender.googleapis.com/v1/projects/my-project/locations/us/insightTypes/google.bigquery.table.StatsInsight/insights
Replace the following:
GCLOUD_AUTH_TOKEN
: the name of a valid gcloud CLI access tokenPROJECT_NAME
: the name of the project containing your BigQuery table
Property | Relevant for subtype | Description |
---|---|---|
content.existingPartitionColumn |
Cluster | Existing partitioning column, if any |
content.tableSizeTb |
All | Size of the table in terabytes |
content.bytesReadMonthly |
All | Monthly bytes read from the table |
content.slotMsConsumedMonthly |
All | Monthly slot milliseconds consumed by the workload running on the table |
content.queryJobsCountMonthly |
All | Monthly count of jobs running on the table |
- For more information about other fields in the insights response, see REST Resource:
projects.locations.insightTypes.insights
. - For more information about using insights, see Using the API - Insights.
Troubleshooting
Issue: No recommendations appear for a specific table.
Partition and cluster recommendations might not appear under the following circumstances:
- The table is less than 10GB.
- The table has a high write cost from data manipulation language (DML) operations.
- The table was not read in the past 30 days.
- The estimated monthly savings is too insignificant (less than 1 slot hour of savings).
- The table is already clustered.
Pricing
There is no cost or adverse impact on workload performance when you view recommendations.