This topic shows you how to export the asset metadata for your organization, folder, or project to a BigQuery table, and then run data analysis on your inventory. BigQuery provides a SQL-like experience for analyzing data and producing meaningful insights without writing custom scripts.
Before you begin
Before you begin, complete the following steps.
Enable the Cloud Asset Inventory API on the project where you'll be running the API commands.
Configure the permissions that are required to call the Cloud Asset Inventory API using either the gcloud CLI or the API.
Complete the following steps to set up your environment.
gcloud
To set up your environment to use the gcloud CLI to call the Cloud Asset Inventory API, install the Google Cloud CLI on your local client.
API
To set up your environment to call the Cloud Asset Inventory API with the curl command, complete the following steps.

- Confirm that you have access to the curl command.
- Grant your account one of the following roles on your project, folder, or organization:
  - Cloud Asset Viewer role (roles/cloudasset.viewer)
  - Owner basic role (roles/owner)
Note that the gcloud CLI uses the billing project as the consumer project. If you receive a permission denied message, you can check if the billing project is different from the core project:
gcloud config list
To set the billing project to the consumer project:
gcloud config set billing/quota_project CONSUMER_PROJECT_NUMBER
If you're exporting to a BigQuery dataset in a project that does not have the Cloud Asset Inventory API enabled, you must also grant the following roles to the service-${CONSUMER_PROJECT_NUMBER}@gcp-sa-cloudasset.iam.gserviceaccount.com service account in the destination project:

- BigQuery Data Editor role (roles/bigquery.dataEditor)
- BigQuery User role (roles/bigquery.user)
The service account is created the first time you call the API. Alternatively, you can create it and grant the service agent role manually with the following commands:

gcloud beta services identity create \
    --service=cloudasset.googleapis.com \
    --project=PROJECT_ID

gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:service-PROJECT_NUMBER@gcp-sa-cloudasset.iam.gserviceaccount.com \
    --role=roles/cloudasset.serviceAgent
Limitations
- BigQuery tables encrypted with customer-managed Cloud Key Management Service (Cloud KMS) keys are not supported.
- Appending the export output to an existing table is not supported unless you're exporting to a partitioned table. The destination table must be empty, or you must overwrite it. To overwrite it, use the --output-bigquery-force flag with the gcloud CLI, or use force with the API.
- Google Kubernetes Engine (GKE) resource types, except for container.googleapis.com/Cluster and container.googleapis.com/NodePool, are not supported when exporting to separate tables per resource type.
Setting the BigQuery schema for the export
Every BigQuery table is defined by a schema that describes the column names, data types, and other information. The content type that you set during the export determines the schema for your table.

- Resource or unspecified: When you set the content type to RESOURCE, or don't specify it, and you set the per-asset-type flag to false or don't use it, you create a BigQuery table that has the schema shown in figure 1. Resource.data is the resource metadata represented as a JSON string (see the query sketch after this list).

  When you set the content type to RESOURCE, or don't set the content type, and set the per-asset-type flag to true, you create separate tables per asset type. The schema of each table includes RECORD-type columns mapped to the nested fields in the Resource.data field of that asset type (up to the 15 nested levels that BigQuery supports). For per-type BigQuery example tables, see projects/export-assets-examples/datasets/structured_export.

- IAM policy: When you set the content type to IAM_POLICY in the REST API or iam-policy in the gcloud CLI, you create a BigQuery table that has the schema shown in figure 2. The iam_policy RECORD is fully expanded.

- Organization policy: When you set the content type to ORG_POLICY in the REST API or org-policy in the gcloud CLI, you create a BigQuery table that has the schema shown in figure 3.

- VPC Service Controls policy: When you set the content type to ACCESS_POLICY in the REST API or access-policy in the gcloud CLI, you create a BigQuery table that has the schema shown in figure 4.

- OS Config instance inventory: When you set the content type to OS_INVENTORY in the REST API or os-inventory in the gcloud CLI, you create a BigQuery table that has the schema shown in figure 5.

- Relationship: When you set the content type to RELATIONSHIP in the REST API or relationship in the gcloud CLI, you create a BigQuery table that has the schema shown in figure 6.
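Because Resource.data in the default schema is a JSON string, you can extract individual fields with BigQuery's JSON functions. The following is a minimal sketch using the bq CLI; it assumes the default schema's resource.data column, the project, dataset, and table names are hypothetical placeholders, and machineType is only an illustrative field of Compute Engine instances.

# Count Compute Engine instances by machine type by extracting a field
# from the Resource.data JSON string.
bq query --use_legacy_sql=false '
SELECT
  JSON_EXTRACT_SCALAR(resource.data, "$.machineType") AS machine_type,
  COUNT(*) AS instance_count
FROM `my_project.my_dataset.asset_snapshot`
WHERE asset_type = "compute.googleapis.com/Instance"
GROUP BY machine_type
ORDER BY instance_count DESC'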
Exporting an asset snapshot
To export an asset snapshot at a given timestamp, complete the following steps.
gcloud
To export assets in a project, run the following command. This command stores the exported snapshot in a BigQuery table at BIGQUERY_TABLE.
gcloud asset export \
    --content-type CONTENT_TYPE \
    --project 'PROJECT_ID' \
    --snapshot-time 'SNAPSHOT_TIME' \
    --bigquery-table 'BIGQUERY_TABLE' \
    --output-bigquery-force
Where:
- CONTENT_TYPE is the asset content type.
- PROJECT_ID is the ID of the project whose metadata is being exported. This project can be the one from which you're running the export or a different project.
- SNAPSHOT_TIME (Optional) is the time at which you want to take a snapshot of your assets. The value must be the current time or a time in the past. By default, a snapshot is taken at the current time. For information on time formats, see gcloud topic datetimes.
- BIGQUERY_TABLE is the table to which you're exporting your metadata, in the format projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME.
- --output-bigquery-force overwrites the destination table if it exists.
To export the assets of an organization or folder, use one of the following flags in place of --project:

- --organization=ORGANIZATION_ID
- --folder=FOLDER_ID

The access-policy content type can only be exported for an --organization.
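For example, a hypothetical invocation (all names are placeholders) that exports a resource snapshot of the project my-project and overwrites the destination table:

gcloud asset export \
    --content-type resource \
    --project 'my-project' \
    --bigquery-table 'projects/my-project/datasets/asset_inventory/tables/asset_snapshot' \
    --output-bigquery-force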
API
To export the asset metadata in your project, run the following command. This command stores the exported snapshot in a BigQuery table named TABLE_NAME. Learn more about the exportAssets method.
gcurl -d '{
  "contentType": "CONTENT_TYPE",
  "outputConfig": {
    "bigqueryDestination": {
      "dataset": "projects/PROJECT_ID/datasets/DATASET_ID",
      "table": "TABLE_NAME",
      "force": true
    }
  }
}' https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER:exportAssets
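In these examples, gcurl is assumed to be a shell alias for an authenticated curl call; it is not a standard command. One common way to define it:

alias gcurl='curl -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json"'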
Exporting separate tables for each resource type
To export assets in a project to separate BigQuery tables for each resource type, use the --per-asset-type flag. Each table's name is BIGQUERY_TABLE concatenated with _ (underscore) and ASSET_TYPE_NAME; non-alphanumeric characters are replaced with _.
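For example (hypothetical names), if BIGQUERY_TABLE ends in tables/my_export, assets of type compute.googleapis.com/Instance land in a table named:

my_export_compute_googleapis_com_Instance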
Note that GKE resource types, except for container.googleapis.com/Cluster and container.googleapis.com/NodePool, are not supported for this type of export.
To export assets to separate BigQuery tables for each resource type, run the following command.
gcloud
gcloud asset export \
    --content-type CONTENT_TYPE \
    --project 'PROJECT_ID' \
    --snapshot-time 'SNAPSHOT_TIME' \
    --bigquery-table 'BIGQUERY_TABLE' \
    --output-bigquery-force \
    --per-asset-type
Where:
- CONTENT_TYPE is the asset content type. This value also determines the schema for the export.
- PROJECT_ID is the ID of the project whose metadata is being exported. This project can be the one from which you're running the export, or a different project.
- SNAPSHOT_TIME (Optional) is the time at which you want to take a snapshot of your assets. The value must be the current time or a time in the past. By default, a snapshot is taken at the current time. See the gcloud topic datetimes for more information on valid time formats.
- BIGQUERY_TABLE is the table to which you're exporting your metadata, in the format projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME.
- --output-bigquery-force overwrites the destination table if it exists.
- --per-asset-type exports to a separate BigQuery table for each resource type.
API
gcurl -d '{
  "contentType": "CONTENT_TYPE",
  "outputConfig": {
    "bigqueryDestination": {
      "dataset": "projects/PROJECT_ID/datasets/DATASET_ID",
      "table": "TABLE_NAME",
      "force": true,
      "separateTablesPerAssetType": true
    }
  }
}' https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER:exportAssets
Learn more about the exportAssets method.
If exporting to any table fails, the entire export operation fails and returns the first error. The results of tables that already exported successfully persist.
The following types are packed into a JSON string to overcome the compatibility issue between proto3 JSON and BigQuery types.
google.protobuf.Timestamp
google.protobuf.Duration
google.protobuf.FieldMask
google.protobuf.ListValue
google.protobuf.Value
google.protobuf.Struct
google.api.*
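As a result, such fields must be unpacked with JSON functions in queries. The following is a minimal sketch, assuming a hypothetical per-type instance table in which creationTimestamp is stored as a JSON-encoded google.protobuf.Timestamp string; all names are placeholders.

# Parse a JSON-encoded protobuf timestamp back into a BigQuery TIMESTAMP.
bq query --use_legacy_sql=false '
SELECT
  name,
  TIMESTAMP(JSON_VALUE(resource.data.creationTimestamp)) AS created
FROM `my_project.my_dataset.asset_snapshot_compute_googleapis_com_Instance`
LIMIT 10'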
Exporting to a partitioned table
To export assets in a project to partitioned tables, use the --partition-key flag. The exported snapshot is stored in a BigQuery table at BIGQUERY_TABLE with daily granularity and two additional timestamp columns, readTime and requestTime, one of which serves as the partition key, depending on the partition-key parameter.
To export assets in a project to partitioned tables, run the following command.
gcloud
gcloud asset export \
    --content-type CONTENT_TYPE \
    --project 'PROJECT_ID' \
    --snapshot-time 'SNAPSHOT_TIME' \
    --bigquery-table 'BIGQUERY_TABLE' \
    --partition-key 'PARTITION_KEY' \
    --output-bigquery-force
Where:
- CONTENT_TYPE is the asset content type. This value also determines the schema for the export.
- PROJECT_ID is the ID of the project whose metadata is being exported. This project can be the one from which you're running the export or a different project.
- SNAPSHOT_TIME (Optional) is the time at which you want to take a snapshot of your assets. The value must be the current time or a time in the past. By default, a snapshot is taken at the current time. For information on time formats, see gcloud topic datetimes.
- BIGQUERY_TABLE is the table to which you're exporting your metadata, in the format projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME.
- PARTITION_KEY is the partition key column to use when exporting to BigQuery partitioned tables.
- --output-bigquery-force overwrites the destination table if it exists.
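For example, a hypothetical invocation, assuming read-time as the PARTITION_KEY value (all other names are placeholders):

gcloud asset export \
    --content-type resource \
    --project 'my-project' \
    --bigquery-table 'projects/my-project/datasets/asset_inventory/tables/asset_snapshot' \
    --partition-key 'read-time' \
    --output-bigquery-force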
API
gcurl -d '{
  "contentType": "CONTENT_TYPE",
  "outputConfig": {
    "bigqueryDestination": {
      "dataset": "projects/PROJECT_ID/datasets/DATASET_ID",
      "table": "TABLE_NAME",
      "force": true,
      "partitionSpec": {"partitionKey": "PARTITION_KEY"}
    }
  }
}' https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER:exportAssets
Learn more about the exportAssets method.
If the destination table already exists, its schema is updated as necessary by appending additional columns. The schema update fails if any column changes its type or mode, such as from optional to repeated. If the schema update succeeds and the output-bigquery-force flag is set to TRUE, the corresponding partition is overwritten by the snapshot results; data in other partitions remains intact. If output-bigquery-force is unset or FALSE, the data is appended to the corresponding partition.

The export operation fails if the schema update or the attempt to append data fails.
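For example, a sketch of reading a single day's partition, assuming a hypothetical table partitioned on the readTime column (all names are placeholders):

# Partition pruning: filter on the partition column to scan only one day.
bq query --use_legacy_sql=false '
SELECT name, asset_type
FROM `my_project.my_dataset.asset_snapshot`
WHERE DATE(readTime) = "2024-01-15"
LIMIT 10'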
Checking the status of an export
To check the status of an export, run the following commands.
gcloud
To check the status of the export, run the following command. This command, including the OPERATION_ID, is displayed in the gcloud CLI output after you run the export command.
gcloud asset operations describe OPERATION_ID
API
To view the status of your export, use the operation ID returned in the response to your export request. You can find the OPERATION_ID in the name field of the response, which is formatted as follows:

"name": "projects/PROJECT_NUMBER/operations/ExportAssets/CONTENT_TYPE/OPERATION_ID"

To check the status of your export, run the following command with the OPERATION_ID:
gcurl https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER/operations/ExportAssets/CONTENT_TYPE/OPERATION_ID
Viewing an asset snapshot
To view the table containing the asset snapshot metadata, complete the following steps.
Console
Go to the BigQuery page in the Google Cloud console.
To display the tables and views in the dataset, open the navigation panel. In the Resources section, select your project to expand it, and then select a dataset.
From the list, select your table.
Select Details and note the value in Number of rows. You may need this value to control the starting point for your results using the gcloud CLI or API.
To view a sample set of data, select Preview.
API
To browse your table's data, call tabledata.list. In the tableId parameter, specify the name of your table.
You can configure the following optional parameters to control the output.
- maxResults is the maximum number of results to return.
- selectedFields is a comma-separated list of columns to return; if unspecified, all columns are returned.
- startIndex is the zero-based index of the starting row to read.
Values are returned wrapped in a JSON object that you must parse, as described in the tabledata.list reference documentation.
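For example, a minimal sketch of the call with placeholder names, using the gcurl alias described earlier:

gcurl "https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME/data?maxResults=10&startIndex=0"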
The export lists the assets and their resource names.
Querying an asset snapshot
After you export your snapshot to BigQuery, you can run queries on your asset metadata. See Exporting to BigQuery Sample Queries to learn more about several typical use cases.
By default, BigQuery runs interactive, or on-demand, query jobs, which means that the query is executed as soon as possible. Interactive queries count towards your concurrent rate limit and your daily limit.
Query results are saved to either a temporary or permanent table. You can choose to append or overwrite data in an existing table or to create a new table, if none exists with the same name.
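For example, a typical interactive query, sketched with the bq CLI; the table name is a hypothetical placeholder, and the query assumes the default RESOURCE schema with an asset_type column:

# Count exported assets by type.
bq query --use_legacy_sql=false '
SELECT asset_type, COUNT(*) AS asset_count
FROM `my_project.my_dataset.asset_snapshot`
GROUP BY asset_type
ORDER BY asset_count DESC'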
To run an interactive query that writes the output to a temporary table, complete the following steps.
Console
Go to the BigQuery page in the Google Cloud console.
Select Compose new query.

In the Query editor text area, enter a valid BigQuery SQL query.
(Optional) To change the data processing location, complete the following steps.
- Select More, and then select Query settings.
- Under Processing location, select Auto-select, and then choose your data's location.
- To update the query settings, select Save.
Select Run.
API
To start a new job, call the jobs.insert method. In the job resource, set the following parameters.

- In the configuration field, set the query field to a JobConfigurationQuery that describes the BigQuery query job.
- In the jobReference field, set the location field appropriately for your job.
To poll for results, call getQueryResults. Poll until jobComplete equals true. You can check for errors and warnings in the errors list.
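The following is a minimal sketch of both calls using the REST API; the project ID, job ID, and query are placeholders, and gcurl is the authenticated curl alias described earlier.

# Start the query job (jobs.insert).
gcurl -d '{
  "configuration": {
    "query": {
      "query": "SELECT asset_type, COUNT(*) AS n FROM `my_project.my_dataset.asset_snapshot` GROUP BY asset_type",
      "useLegacySql": false
    }
  },
  "jobReference": {"location": "US", "jobId": "my-asset-query-001"}
}' https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/jobs

# Poll jobs.getQueryResults until jobComplete is true.
gcurl "https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/queries/my-asset-query-001?location=US"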