This topic shows you how to export the asset metadata for your organization, folder, or project to a BigQuery table, and then run data analysis on your inventory. BigQuery provides a SQL-like experience for users to analyze data and produce meaningful insights without the use of custom scripts.
Before you begin
Before you begin, complete the following steps.
Enable the Cloud Asset Inventory API on the project where you'll be running the API commands.
Configure the permissions that are required to call the Cloud Asset Inventory API using either the gcloud tool or the API. Complete the following steps to set up your environment.
gcloud

To set up your environment to use the gcloud tool to call the Cloud Asset Inventory API, install the Cloud SDK on your local client.

API
To set up your environment to call the Cloud Asset Inventory API with the Unix curl command, complete the following steps.

- Install oauth2l on your local machine so you can interact with the Google OAuth system.
- Confirm that you have access to the Unix curl command.

Ensure that you grant your account one of the following roles on your project, folder, or organization.
- Cloud Asset Viewer role (roles/cloudasset.viewer)
- Owner basic role (roles/owner)
If you're exporting to a BigQuery dataset in a project that does not have the Cloud Asset Inventory API enabled, you must also grant the following roles to the service-${CONSUMER_PROJECT_NUMBER}@gcp-sa-cloudasset.iam.gserviceaccount.com service account in the destination project.

- BigQuery Data Editor role (roles/bigquery.dataEditor)
- BigQuery User role (roles/bigquery.user)
The service account is created the first time you call the API; alternatively, you can create it with the following command:
gcloud beta services identity create --service=cloudasset.googleapis.com --project=PROJECT_ID
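To grant these roles with the gcloud tool, you can use commands like the following. This is a minimal sketch; the destination project ID and the project number in the service account name are placeholders you must replace:

# Grant the export service account write access to BigQuery in the destination project.
gcloud projects add-iam-policy-binding DESTINATION_PROJECT_ID \
  --member='serviceAccount:service-CONSUMER_PROJECT_NUMBER@gcp-sa-cloudasset.iam.gserviceaccount.com' \
  --role='roles/bigquery.dataEditor'

# Allow the service account to run BigQuery jobs in the destination project.
gcloud projects add-iam-policy-binding DESTINATION_PROJECT_ID \
  --member='serviceAccount:service-CONSUMER_PROJECT_NUMBER@gcp-sa-cloudasset.iam.gserviceaccount.com' \
  --role='roles/bigquery.user'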
Exporting an asset snapshot
To export an asset snapshot at a given timestamp, complete the following steps.
gcloud
To export assets in a project, run the following command. This command stores the exported snapshot in a BigQuery table at BIGQUERY_TABLE.
gcloud asset export \
  --content-type CONTENT_TYPE \
  --project 'PROJECT_ID' \
  --snapshot-time 'SNAPSHOT_TIME' \
  --bigquery-table 'BIGQUERY_TABLE' \
  --output-bigquery-force
Where:
- CONTENT_TYPE is the asset content type.
- PROJECT_ID is the ID of the project whose metadata is being exported. This project can be the one from which you're running the export or a different project.
- SNAPSHOT_TIME (Optional) is the time at which you want to take a snapshot of your assets. The value must be the current time or a time in the past. By default, a snapshot is taken at the current time. For information on time formats, see gcloud topic datetimes.
- BIGQUERY_TABLE is the table to which you're exporting your metadata, in the format projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME.
- --output-bigquery-force overwrites the destination table if it exists.
To export the assets of an organization or folder, you can use one of the following flags in place of --project.

- --organization=ORGANIZATION_ID
- --folder=FOLDER_ID

The access-policy content type can only be exported for an --organization.
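For example, a complete invocation might look like the following sketch, where the project ID, dataset, and table names are hypothetical placeholders:

# Export a resource snapshot of my-project-id into the asset_inventory dataset.
gcloud asset export \
  --content-type resource \
  --project 'my-project-id' \
  --bigquery-table 'projects/my-project-id/datasets/asset_inventory/tables/asset_snapshot' \
  --output-bigquery-force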
API
To export the asset metadata in your project, run the following command. This command stores the exported snapshot in a BigQuery table named TABLE_NAME. Learn more about the exportAssets method.
gcurl -d '{"contentType":"CONTENT_TYPE", \
  "outputConfig":{ \
    "bigqueryDestination": { \
      "dataset": "projects/PROJECT_ID/datasets/DATASET_ID", \
      "table": "TABLE_NAME", \
      "force": true \
    } \
  }}' \
  https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER:exportAssets
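The gcurl command used throughout this topic is a common shorthand for curl with an OAuth 2.0 access token attached. One way to define it, assuming you installed oauth2l as described in Before you begin (the exact flags can vary by oauth2l version), is:

# Define gcurl as curl with a cloud-platform access token and a JSON content type.
alias gcurl='curl -sS -H "$(oauth2l header cloud-platform)" -H "Content-Type: application/json"'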
Setting the content type
Every BigQuery table is defined by a schema that describes the column names, data types, and other information. Setting the content type during the export determines the schema for your table.
Resource or unspecified: When you set the content type to RESOURCE or do not set the content type, you create a BigQuery table that has the schema shown in figure 1. Resource.data is the resource metadata represented as a JSON string.

IAM policy: When you set the content type to IAM_POLICY in the REST API or iam-policy in the gcloud tool, you create a BigQuery table that has the schema shown in figure 2. The iam_policy RECORD is fully expanded.

Organization policy: When you set the content type to ORG_POLICY in the REST API or org-policy in the gcloud tool, you create a BigQuery table that has the schema shown in figure 3.

VPC Service Controls policy: When you set the content type to ACCESS_POLICY in the REST API or access-policy in the gcloud tool, you create a BigQuery table that has the schema shown in figure 4.

OS Config instance inventory: When you set the content type to OS_INVENTORY in the REST API or os-inventory in the gcloud tool, you create a BigQuery table that has the schema shown in figure 5.
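Because Resource.data is a JSON string when the content type is RESOURCE, you can parse individual fields with BigQuery's JSON functions. The following is a minimal sketch using the bq command-line tool; the table name is a hypothetical placeholder, and the $.location field is present only on resource types that record a location:

# Pull a field out of the resource.data JSON string for the first ten assets.
bq query --nouse_legacy_sql '
SELECT
  name,
  asset_type,
  JSON_EXTRACT_SCALAR(resource.data, "$.location") AS location
FROM `my-project-id.asset_inventory.asset_snapshot`
LIMIT 10'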
Separate tables per resource type
To export an asset snapshot at a given timestamp per resource type, complete the following steps:
gcloud
To export assets in a project per resource type, run the following command. This command stores the exported snapshot in zero tables if the snapshot results are empty, or in multiple BigQuery tables otherwise. Each table contains the results of one asset type, and is named BIGQUERY_TABLE concatenated with _ (underscore) and the asset type name, with non-alphanumeric characters replaced by _. For example, the asset type compute.googleapis.com/Instance produces a table named TABLE_NAME_compute_googleapis_com_Instance.
gcloud asset export \
  --content-type CONTENT_TYPE \
  --project 'PROJECT_ID' \
  --snapshot-time 'SNAPSHOT_TIME' \
  --bigquery-table 'BIGQUERY_TABLE' \
  --output-bigquery-force \
  --per-asset-type
Where:
- CONTENT_TYPE is the asset content type.
- PROJECT_ID is the ID of the project whose metadata is being exported. This project can be the one from which you're running the export, or a different project.
- SNAPSHOT_TIME (Optional) is the time at which you want to take a snapshot of your assets. The value must be the current time or a time in the past. By default, a snapshot is taken at the current time. See the gcloud topic datetimes for more information on valid time formats.
- BIGQUERY_TABLE is the table to which you're exporting your metadata, in the format projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME.
- --output-bigquery-force overwrites the destination table if it exists.
- --per-asset-type exports to multiple BigQuery tables, one per resource type.
API
To export assets in a project per resource type, run the following command. This command stores the exported snapshot in zero tables if the snapshot results are empty, or in multiple BigQuery tables otherwise. Each table contains the results of one asset type, and is named BIGQUERY_TABLE concatenated with _ (underscore) and the asset type name, with non-alphanumeric characters replaced by _. See the exportAssets method for more information.
gcurl -d '{"contentType":"CONTENT_TYPE", \
  "outputConfig":{ \
    "bigqueryDestination": { \
      "dataset": "projects/PROJECT_ID/datasets/DATASET_ID", \
      "table": "TABLE_NAME", \
      "force": true, \
      "separateTablesPerAssetType": true \
    } \
  }}' \
  https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER:exportAssets
If the export to any table fails, the entire export operation fails and returns the first error; however, the results of the exports that succeeded persist.

Setting the content type during the export determines the schema for each table.
Resource: When you set the content type to RESOURCE, the schema of each table includes RECORD-type columns mapped to the nested fields in the Resource.data field of that asset type (up to the 15 nested levels that BigQuery supports). For example tables, see the per-type BigQuery schemas in projects/export-assets-examples/datasets/structured_export.

IAM policy, Organization policy, VPC Service Controls policy, or unspecified: When you set the content type to IAM_POLICY, ORG_POLICY, or ACCESS_POLICY, or do not set the content type, each table has the same schema as when --per-asset-type is not set. For details, see the schemas in the Setting the content type section.
The following types are packed in a JSON string to overcome the incompatibility between proto3 and BigQuery types.

- google.protobuf.Timestamp
- google.protobuf.Duration
- google.protobuf.FieldMask
- google.protobuf.ListValue
- google.protobuf.Value
- google.protobuf.Struct
- google.api.*
Export to partitioned tables
To export an asset snapshot at a given timestamp in a partitioned table, complete the following steps.
gcloud
To export assets in a project to partitioned tables, run the following command. This command stores the exported snapshot in a BigQuery table at BIGQUERY_TABLE with daily granularity and two additional timestamp columns, readTime and requestTime, one of which is the partition key according to your partition-key parameter.
gcloud asset export \
  --content-type CONTENT_TYPE \
  --project 'PROJECT_ID' \
  --snapshot-time 'SNAPSHOT_TIME' \
  --bigquery-table 'BIGQUERY_TABLE' \
  --partition-key 'PARTITION_KEY' \
  --output-bigquery-force
Where:
- CONTENT_TYPE is the asset content type.
- PROJECT_ID is the ID of the project whose metadata is being exported. This project can be the one from which you're running the export or a different project.
- SNAPSHOT_TIME (Optional) is the time at which you want to take a snapshot of your assets. The value must be the current time or a time in the past. By default, a snapshot is taken at the current time. For information on time formats, see gcloud topic datetimes.
- BIGQUERY_TABLE is the table to which you're exporting your metadata, in the format projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME.
- PARTITION_KEY is the partition key column when exporting to BigQuery partitioned tables.
- --output-bigquery-force overwrites the destination table if it exists.
API
To export assets in a project to partitioned tables, run the following command. This command stores the exported snapshot in a BigQuery table at BIGQUERY_TABLE with daily granularity and two additional timestamp columns, readTime and requestTime, one of which is the partition key according to your partitionKey parameter. Learn more about the exportAssets method.
gcurl -d '{"contentType":"CONTENT_TYPE", \
  "outputConfig":{ \
    "bigqueryDestination": { \
      "dataset": "projects/PROJECT_ID/datasets/DATASET_ID", \
      "table": "TABLE_NAME", \
      "force": true, \
      "partitionSpec": {"partitionKey": "PARTITION_KEY"} \
    } \
  }}' \
  https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER:exportAssets
If a destination table already exists, its schema is updated as necessary by appending additional columns. This schema update fails if any column changes its type or mode, such as from optional to repeated. Then, if the output-bigquery-force flag (the force field in the API) is set to true, the corresponding partition is overwritten by the snapshot results, while data in other partitions remains intact. If output-bigquery-force is unset or false, the data is appended to the corresponding partition. The export operation fails if the schema update or the attempt to append data fails.
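To read back a single day's snapshot from a partitioned table, filter on the partition column. The following is a minimal sketch with the bq tool, assuming readTime was chosen as the partition key and using placeholder table and date values:

# Select one daily partition by its readTime date.
bq query --nouse_legacy_sql '
SELECT name, asset_type
FROM `my-project-id.asset_inventory.asset_snapshot`
WHERE DATE(readTime) = "2021-01-01"'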
Checking the status of an export
To check the status of an export, run the following commands.
gcloud
To check the status of the export, run the following command. The operation ID is displayed in the gcloud tool after you run the export command.
gcloud asset operations describe OPERATION_ID
API
To view the status of your export, run the following command with the operation ID returned in the response to your export.
You can find the OPERATION_ID in the name field of the response to the export, which is formatted as follows:

"name": "projects/PROJECT_NUMBER/operations/ExportAssets/CONTENT_TYPE/OPERATION_ID"
To check the status of your export, run the following command with the OPERATION_ID:
gcurl https://cloudasset.googleapis.com/v1/projects/PROJECT_NUMBER/operations/ExportAssets/CONTENT_TYPE/OPERATION_ID
Viewing an asset snapshot
To view the table containing the asset snapshot metadata, complete the following steps.
Console
Go to the BigQuery page in the Cloud Console.

To display the tables and views in the dataset, open the navigation panel. In the Resources section, select your project to expand it, and then select a dataset.

From the list, select your table.

Select Details and note the value in Number of rows. You may need this value to control the starting point for your results using the gcloud tool or the API.

To view a sample set of data, select Preview.
API
To browse your table's data, call tabledata.list. In the tableId parameter, specify the name of your table.
You can configure the following optional parameters to control the output.

- maxResults is the maximum number of results to return.
- selectedFields is a comma-separated list of columns to return; if unspecified, all columns are returned.
- startIndex is the zero-based index of the starting row to read.

Values are returned wrapped in a JSON object that you must parse, as described in the tabledata.list reference documentation.
The export lists the assets and their resource names.
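For example, the following sketch reads the first ten rows with the gcurl alias shown earlier; the project, dataset, and table IDs are placeholders:

# Read ten rows starting at the beginning of the table.
gcurl "https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/datasets/DATASET_ID/tables/TABLE_NAME/data?maxResults=10&startIndex=0"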
Querying an asset snapshot
After you export your snapshot to BigQuery, you can run queries on your asset metadata. See Exporting to BigQuery Sample Queries to learn more about several typical use cases.
By default, BigQuery runs interactive, or on-demand, query jobs, which means that the query is executed as soon as possible. Interactive queries count towards your concurrent rate limit and your daily limit.
Query results are saved to either a temporary or permanent table. You can choose to append or overwrite data in an existing table or to create a new table, if none exists with the same name.
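For example, the following sketch counts the assets in a snapshot by type using the bq command-line tool, which is installed with the Cloud SDK; the table name is a placeholder:

# Count the exported assets per asset type.
bq query --nouse_legacy_sql '
SELECT asset_type, COUNT(*) AS asset_count
FROM `my-project-id.asset_inventory.asset_snapshot`
GROUP BY asset_type
ORDER BY asset_count DESC'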
To run an interactive query that writes the output to a temporary table, complete the following steps.
Console
Go to the BigQuery page in the Cloud Console.

Select Compose new query.

In the Query editor text area, enter a valid BigQuery SQL query.
(Optional) To change the data processing location, complete the following steps.
- Select More, and then select Query settings.
- Under Processing location, select Auto-select, and then choose your data's location.
- To update the query settings, select Save.
Select Run.
API
To start a new job, call the jobs.insert method. In the job resource, set the following parameters.

- In the configuration field, set the query field to a JobConfigurationQuery that describes the BigQuery query job.
- In the jobReference field, set the location field appropriately for your job.
To poll for results, call getQueryResults. Poll until jobComplete equals true. You can check for errors and warnings in the errors list.
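As a minimal sketch of these two calls with the gcurl alias, assuming placeholder project and table names and a job running in the US location (take JOB_ID from the jobReference in the first response):

# Start an interactive query job.
gcurl -d '{
  "configuration": {
    "query": {
      "query": "SELECT asset_type, COUNT(*) AS asset_count FROM `my-project-id.asset_inventory.asset_snapshot` GROUP BY asset_type",
      "useLegacySql": false
    }
  },
  "jobReference": {"location": "US"}
}' https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/jobs

# Poll for results until jobComplete is true.
gcurl "https://bigquery.googleapis.com/bigquery/v2/projects/PROJECT_ID/queries/JOB_ID?location=US"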