Export recommendations to BigQuery

Overview

With the BigQuery export, you can view daily snapshots of recommendations for your organization. The export uses the BigQuery Data Transfer Service. To see which recommenders are currently included in the BigQuery export, see the Recommender documentation.

Before you begin

Complete the following steps before you create a data transfer for recommendations:

  • Grant the BigQuery Data Transfer Service permission to manage your data transfer. If you use the BigQuery web UI to create the transfer, allow pop-ups from console.cloud.google.com in your browser so that you can view the permissions dialog. For more details, see Enable the BigQuery Data Transfer Service.
  • Create a BigQuery dataset to store the data (see the example after this list).
    • The data transfer uses the region where the dataset is created. This location is immutable after the dataset and the transfer are created.
    • The dataset contains insights and recommendations from all regions, so the export aggregates data from every region into the dataset's location. If you have data residency concerns, contact Google Cloud Customer Care.
    • If the dataset location was launched recently, the initial export data may be delayed.
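
For example, a minimal sketch of creating a dataset with the bq CLI; the dataset name recommender_export and the US location are placeholder assumptions:

    # Create a dataset in the chosen location to receive the export
    bq --location=US mk --dataset <project_id>:recommender_export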

Pricing

Exporting recommendations to BigQuery is available to all Recommender customers based on their Recommender pricing tier.

Required permissions

To set up the data transfer, you need the following permissions in the project where you create the transfer:

  • bigquery.transfers.update - Allows you to create the transfer
  • bigquery.datasets.update - Allows you to update actions on the target dataset
  • resourcemanager.projects.update - Allows you to select a project where you'd like the export data to be stored
  • pubsub.topics.list - Allows you to select a Pub/Sub topic in order to receive notifications about your export

The following permission is required at the organization level, on the organization for which you are setting up the export:

  • recommender.resources.export - Allows you to export recommendations to BigQuery

The following permissions are required to export negotiated prices for cost savings recommendations:

  • billing.resourceCosts.get at project level - Allows exporting negotiated prices for project level recommendations
  • billing.accounts.getSpendingInformation at billing account level - Allows exporting negotiated prices for billing account level recommendations

Without these permissions, cost savings recommendations will be exported with standard prices instead of negotiated prices.

Grant permissions

The following roles have to be granted on the project where you create the data transfer:

  • To allow you to create a transfer and update actions on the target dataset, grant the BigQuery Admin role (roles/bigquery.admin).
  • Several roles contain the permissions to select a project for storing your export data and to select a Pub/Sub topic for notifications. To have both permissions available, you can grant the Project Owner role (roles/owner).
  • Several roles contain the billing.resourceCosts.get permission needed to export negotiated prices for project-level cost savings recommendations. You can grant any one of them:
    • Project Owner role (roles/owner)
    • Project Editor role (roles/editor)
    • Project Viewer role (roles/viewer)
  • Several roles contain the billing.accounts.getSpendingInformation permission needed to export negotiated prices for billing-account-level cost savings recommendations. You can grant any one of them:
    • Billing Account Administrator role (roles/billing.admin)
    • Billing Account Costs Manager role (roles/billing.costsManager)
    • Billing Account Viewer role (roles/billing.viewer)

You must grant the following role at the organization level:

  • Recommendations Exporter role (roles/recommender.exporter)

You can also create custom roles that contain the required permissions.
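
As a minimal sketch, the project-level and organization-level roles can be granted with the gcloud CLI; the IDs and the member name below are placeholders:

    # Project level: create the transfer and manage the target dataset
    gcloud projects add-iam-policy-binding <project_id> \
        --member='user:<user_name>' --role='roles/bigquery.admin'

    # Organization level: allow exporting recommendations
    gcloud organizations add-iam-policy-binding <organization_id> \
        --member='user:<user_name>' --role='roles/recommender.exporter'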

Create a data transfer for recommendations

  1. Sign in to the Google Cloud console.

  2. From the Home screen, click the Recommendations tab.

  3. Click Export to view the BigQuery export form.

  4. Select a Destination Project to store the recommendation data and click Next.

  5. Click Enable APIs to enable the BigQuery APIs for the export. This can take several seconds to complete. Once done, click Continue.

  6. On the Configure Transfer form, provide the following details:

    • In the Transfer config name section, for Display name, enter a name for the transfer. The transfer name can be any value that allows you to easily identify the transfer if you need to modify it later.

    • In the Schedule options section, for Schedule, leave the default value (Start now) or click Start at a set time.

      • For Repeats, choose an option for how often to run the transfer.

        • Daily (default)
        • Weekly
        • Monthly
        • Custom
        • On-demand
      • For Start date and run time, enter the date and time to start the transfer. If you choose Start now, this option is disabled.

    • In the Destination settings section, for Destination dataset, choose the dataset ID you created to store your data.

    • In the Data source details section:

      • The default value for organization_id is the organization whose recommendations you are currently viewing. To export recommendations for a different organization, change the organization in the selector at the top of the console.

    • (Optional) In the Notification options section:

      • Click the toggle to enable email notifications. When you enable this option, the transfer administrator receives an email notification when a transfer run fails.
      • For Select a Pub/Sub topic, choose your topic name or click Create a topic. This option configures Pub/Sub run notifications for your transfer.

  7. Click Create to create the transfer.

  8. Click Allow on the consent pop-up.

  9. Once the transfer is created, you are directed back to the Recommendation Hub. You can click the link to access the transfer configuration details. Alternatively, you can access the transfers by doing the following:

    • Go to the BigQuery page in the Google Cloud console.

    • Click Data Transfers. You can view all the available data transfers.
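
You can also list transfer configurations with the bq CLI; a sketch, assuming the transfer was created in the us location:

    # List all transfer configurations in a project
    bq ls --transfer_config --transfer_location=us --project_id=<project_id>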

View the run history for a transfer

To view the run history for a transfer, do the following:

  1. Go to the BigQuery page in the Google Cloud console.

  2. Click Data Transfers. You can view all the available data transfers.

  3. Click the appropriate transfer in the list.

  4. In the list of transfer runs displayed under the RUN HISTORY tab, select the run you want to view details for.

  5. The Run details panel is displayed for the individual run you selected. Some of the possible run details are (a CLI sketch for inspecting runs follows this list):

    • A transfer deferred due to unavailable source data.
    • A job indicating the number of rows exported to a table.
    • Missing permissions for a data source, which you must grant before scheduling a backfill.
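
As an alternative, run history can be inspected with the bq CLI; a sketch, where the us location and config ID are placeholder assumptions:

    # List the latest run attempts for a transfer configuration
    bq ls --transfer_run --run_attempt='LATEST' \
        projects/<project_id>/locations/us/transferConfigs/<config_id>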

When does your data get exported?

When you create a data transfer, the first export occurs within two days. After the first export, the export jobs run at the cadence you requested at setup time. The following conditions apply:

  • The export job for a specific day (D) exports that day's end-of-day data to your BigQuery dataset, and usually finishes by the end of the next day (D+1). The export job runs in the PST time zone, which can appear as additional delay in other time zones.

  • The daily export job does not run until all the data for export is available. This can cause variations, and sometimes delays, in the day and time when your dataset is updated. Therefore, it is best to use the latest available snapshot of data (as shown below) rather than taking a hard, time-sensitive dependency on specific dated tables.
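
For example, a quick way to find the most recent snapshot date exported so far; a sketch using the bq CLI, with placeholder project and dataset names:

    # Print the latest exported partition date
    bq query --use_legacy_sql=false '
    SELECT MAX(DATE(_PARTITIONTIME)) AS latest_snapshot
    FROM `<project>.<dataset>.recommendations_export`'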

Common status messages on an export

Learn about common status messages that you can see when exporting recommendations to BigQuery.

User does not have required permission

The following message occurs when the user does not have the required recommender.resources.export permission:

User does not have required permission "recommender.resources.export". Please, obtain the required permissions for the datasource and try again by triggering a backfill for this date

To resolve this issue, grant the roles/recommender.exporter IAM role, at the organization level, to the user or service account that set up the export, on the organization for which the export was set up. You can grant the role with the following gcloud commands:

  • For a user:

    gcloud organizations add-iam-policy-binding <organization_id> --member='user:<user_name>' --role='roles/recommender.exporter'
    
  • For a service account:

    gcloud organizations add-iam-policy-binding <organization_id> --member='serviceAccount:<service_acct_name>' --role='roles/recommender.exporter'
    

Transfer deferred due to source data not being available

The following message occurs when the transfer is rescheduled because the source data is not yet available. This is not an error; it means that the export pipelines have not completed yet for the day. The transfer re-runs at the new scheduled time and succeeds once the export pipelines have completed. You will see the following message:

Transfer deferred due to source data not being available

Source data not found

The following message occurs when the export pipelines completed, but no recommendations or insights were found for the organization that the export was set up for. You will see the following message:

Source data not found for 'recommendations_export$<date>' or 'insights_export$<date>'

This message can occur for the following reasons:

  • The export was set up less than two days ago. The first export occurs within about two days of creating the transfer.
  • There are no recommendations or insights available for the organization for the specific day. This might actually be the case, or the pipelines might have run before all recommendations or insights were available for the day.

View tables for a transfer

When you export recommendations to BigQuery, the dataset contains two tables that are partitioned by date:

  • recommendations_export
  • insights_export

For more details on tables and schema, see Creating and using tables and Specifying a schema.

To view the tables for a data transfer, do the following:

  1. Go to the BigQuery page in the Google Cloud console.

  2. Click Data Transfers. You can view all the available data transfers.

  3. Click the appropriate transfer in the list.

  4. Click the CONFIGURATION tab and click the dataset.

  5. In the Explorer panel, expand your project and select a dataset. The description and details appear in the details panel. The tables for a dataset are listed with the dataset name in the Explorer panel.
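
You can also inspect a table's schema with the bq CLI; a sketch, with placeholder project and dataset names:

    # Print the schema of the recommendations export table
    bq show --schema --format=prettyjson <project>:<dataset>.recommendations_export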

Schedule a backfill

You can export recommendations for a date in the past (provided that the date is later than the date when the organization opted in to the export) by scheduling a backfill. To schedule a backfill, do the following (a CLI sketch follows these steps):

  1. Go to the BigQuery page in the Google Cloud console.

  2. Click Data Transfers.

  3. On the Transfers page, click the appropriate transfer in the list.

  4. Click Schedule backfill.

  5. In the Schedule backfill dialog, choose your Start date and End date.
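
A backfill can also be scheduled with the bq CLI; in this sketch, the date range, us location, and config ID are placeholder assumptions:

    # Schedule backfill runs for the given date range
    bq mk --transfer_run \
        --start_time='2024-01-01T00:00:00Z' \
        --end_time='2024-01-03T00:00:00Z' \
        projects/<project_id>/locations/us/transferConfigs/<config_id>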

For more information on working with transfers, see Working with transfers.

Example queries

You can use the following sample queries to analyze your exported data.

Viewing cost savings for recommendations where the recommendation duration is displayed in days

SELECT name, recommender, target_resources,
  case
    when primary_impact.cost_projection.cost.units is null
      then round(primary_impact.cost_projection.cost.nanos * power(10,-9), 2)
    else round(primary_impact.cost_projection.cost.units +
      (primary_impact.cost_projection.cost.nanos * power(10,-9)), 2)
  end as dollar_amt,
  primary_impact.cost_projection.duration.seconds/(60*60*24) as duration_in_days
FROM `<project>.<dataset>.recommendations_export`
WHERE DATE(_PARTITIONTIME) = "<date>"
and primary_impact.category = "COST"

Viewing the list of unused IAM roles

SELECT *
FROM `<project>.<dataset>.recommendations_export`
WHERE DATE(_PARTITIONTIME) = "<date>"
and recommender = "google.iam.policy.Recommender"
and recommender_subtype = "REMOVE_ROLE"

Viewing a list of granted roles that must be replaced by smaller roles

SELECT *
FROM `<project>.<dataset>.recommendations_export`
WHERE DATE(_PARTITIONTIME) = "<date>"
and recommender = "google.iam.policy.Recommender"
and recommender_subtype = "REPLACE_ROLE"

Viewing insights for a recommendation

SELECT recommendations.name as recommendation_name,
insights.name as insight_name,
recommendations.cloud_entity_id,
recommendations.cloud_entity_type,
recommendations.recommender,
recommendations.recommender_subtype,
recommendations.description,
recommendations.target_resources,
recommendations.recommendation_details,
recommendations.state,
recommendations.last_refresh_time as recommendation_last_refresh_time,
insights.insight_type,
insights.insight_subtype,
insights.category,
insights.description,
insights.insight_details,
insights.state,
insights.last_refresh_time as insight_last_refresh_time
FROM `<project>.<dataset>.recommendations_export` as recommendations,
   `<project>.<dataset>.insights_export` as insights
WHERE DATE(recommendations._PARTITIONTIME) = "<date>"
and DATE(insights._PARTITIONTIME) = "<date>"
and insights.name in unnest(recommendations.associated_insights)

Viewing recommendations for projects belonging to a specific folder

This query returns recommendations for projects whose parent folders, up to five levels above the project, include the specified folder.

SELECT *
FROM `<project>.<dataset>.recommendations_export`
WHERE DATE(_PARTITIONTIME) = "<date>"
and "<folder_id>" in unnest(ancestors.folder_ids)

Viewing recommendations for the latest available date exported so far

DECLARE max_date TIMESTAMP;

SET max_date = (
  SELECT MAX(_PARTITIONTIME) FROM
  `<project>.<dataset>.recommendations_export`
  );

SELECT *
FROM `<project>.<dataset>.recommendations_export`
WHERE _PARTITIONTIME = max_date

Use Sheets to explore BigQuery data

As an alternative to running queries in BigQuery, you can access, analyze, visualize, and share billions of rows of BigQuery data from your spreadsheet with Connected Sheets, the BigQuery data connector for Google Sheets. For more information, see Get started with BigQuery data in Google Sheets.

Set up the export using the BigQuery command line and REST API

  • Get the required permissions:

    You can grant the required Identity and Access Management permissions via the Google Cloud console or the command line.

    For example, to grant the organization-level recommender.resources.export permission to a service account from the command line:

    gcloud organizations add-iam-policy-binding <organization_id> --member='serviceAccount:<service_acct_name>' --role='roles/recommender.exporter'

  • Create a dataset and enable the BigQuery API.

  • Enroll the project in the BigQuery data source.

    Data source to use: 6063d10f-0000-2c12-a706-f403045e6250

  • Create the export:

    bq mk \
    --transfer_config \
    --project_id=project_id \
    --target_dataset=dataset_id \
    --display_name=name \
    --params='parameters' \
    --data_source=data_source \
    --service_account_name=service_account_name
    

    Where:

    • project_id is your project ID.
    • dataset_id is the target dataset for the transfer configuration.
    • name is the display name for the transfer configuration. The transfer name can be any value that allows you to easily identify the transfer if you need to modify it later.
    • parameters contains the parameters for the created transfer configuration in JSON format. For the recommendations and insights BigQuery export, you must supply the organization_id for which recommendations and insights need to be exported. Parameter format: '{"organization_id":"<org id>"}'
    • data_source is the data source to use: 6063d10f-0000-2c12-a706-f403045e6250
    • service_account_name is the service account name used for authenticating your export. The service account should be owned by the same project_id used for creating the transfer and it should have all the required permissions listed above.
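
    For illustration, a filled-in sketch of the command; all values except the data source ID are placeholders:

    bq mk \
    --transfer_config \
    --project_id=my-project \
    --target_dataset=recommender_export \
    --display_name='Recommendations export' \
    --params='{"organization_id":"1234567890"}' \
    --data_source=6063d10f-0000-2c12-a706-f403045e6250 \
    --service_account_name=recommender-export@my-project.iam.gserviceaccount.com
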
  • Manage an existing export via the UI or the BigQuery command line (a sketch of an update command follows this list).

  • Note: the export runs as the user that set up the export, irrespective of who updates the export configuration in the future. For example, if the export is set up using a service account, and a human user later updates the export configuration via the BigQuery Data Transfer Service UI, the export continues to run as the service account. In this case, the permission check for recommender.resources.export is done against the service account every time the export runs.
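
For example, a sketch of renaming an existing transfer configuration with the bq CLI; the us location and config ID are placeholder assumptions:

    # Update the display name of an existing transfer configuration
    bq update --transfer_config \
        --display_name='Recommendations export (updated)' \
        projects/<project_id>/locations/us/transferConfigs/<config_id>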