This page lists example SQL queries to analyze your assets data after you export them into BigQuery. See Standard SQL Query Syntax for more information.
Query directly on available columns
To find the quantity of each asset type, run the following query:
SELECT asset_type, COUNT(*) AS asset_count FROM `PROJECT_ID.DATASET_ID.TABLE_NAME` GROUP BY asset_type ORDER BY asset_count DESC
Query on repeated fields
To find Identity and Access Management (IAM) policies that grant access to Gmail accounts,
run the following query. BigQuery uses UNNEST
to flatten
repeated fields into a table
that you can query directly:
SELECT name, asset_type, bindings.role FROM `PROJECT_ID.DATASET_ID.TABLE_NAME` JOIN UNNEST(iam_policy.bindings) AS bindings JOIN UNNEST(bindings.members) AS principals WHERE principals like "%@gmail.com"
To find an organization, folder, or project that allows creating using a public IP, run the following query. This query is useful because allowing public IPs with Cloud SQL instances can introduce vulnerabilities unless SSL or a proxy is configured:
SELECT name FROM `PROJECT_ID.DATASET_ID.TABLE_NAME` JOIN UNNEST(org_policy) AS op WHERE op.constraint = "constraints/sql.restrictPublicIp" AND (op.boolean_policy IS NULL OR op.boolean_policy.enforced = FALSE);
To find an organization, folder, or project within the same VPC Service Controls service perimeter of a project, run the following query:
SELECT service_perimeter.title, service_perimeter.status.resources FROM `PROJECT_ID.DATASET_ID.TABLE_NAME` CROSS JOIN UNNEST(service_perimeter.status.resources) as resource WHERE resource = "projects/PROJECT_NUMBER";
Query on extracted JSON string
To find open firewall rules, run the following query. Learn more about JSON functions used in BigQuery.
CREATE TEMP FUNCTION json2array(json STRING) RETURNS ARRAY<STRING> LANGUAGE js AS """ return JSON.parse(json).map(x=>JSON.stringify(x)); """; SELECT firewall.name, firewall.resource.parent, JSON_EXTRACT(firewall.resource.data, '$.sourceRanges') AS sourceRanges FROM `PROJECT_ID.DATASET_ID.TABLE_NAME` AS firewall JOIN UNNEST(json2array(JSON_EXTRACT(firewall.resource.data, '$.sourceRanges'))) AS source_ranges WHERE asset_type="compute.googleapis.com/Firewall" AND JSON_EXTRACT(firewall.resource.data, '$.sourceRanges') IS NOT NULL AND JSON_EXTRACT_SCALAR(source_ranges, '$') = "0.0.0.0/0"
By separating tables per resource type, we can find open firewall rules with an easier and faster query.
SELECT firewall.name, firewall.resource.parent, sourceRanges FROM `PROJECT_ID.DATASET_ID.STRUCTURED_INSTANCE_TABLE_NAME` AS firewall JOIN UNNEST(firewall.resource.data.sourceRanges) AS sourceRanges WHERE sourceRanges = "0.0.0.0/0";
Query on joined resource types
To join tables of different resource types, run the following query. The
following example shows how to find all subnetworks that have no VM attached.
First, the query finds all subnetworks. Then, from that list, it selects those
subnetworks whose selfLinks
are not present.
SELECT name, JSON_EXTRACT(subnetwork.resource.data, '$.selfLink') AS selflink FROM `PROJECT_ID.DATASET_ID.TABLE_NAME` AS subnetwork WHERE asset_type = "compute.googleapis.com/Subnetwork" AND (JSON_EXTRACT(subnetwork.resource.data, '$.selfLink') NOT IN (SELECT DISTINCT JSON_EXTRACT(network_interfaces, '$.subnetwork') FROM `PROJECT_ID.DATASET_ID.TABLE_NAME` as instance JOIN UNNEST(json2array(JSON_EXTRACT(instance.resource.data, '$.networkInterfaces'))) AS network_interfaces WHERE asset_type ="compute.googleapis.com/Instance" AND JSON_EXTRACT(instance.resource.data, '$.networkInterfaces') IS NOT NULL )) IS NULL
By separating tables per resource type, we can find all subnetworks that have no VM attached with an easier and faster query.
SELECT name, subnetwork.resource.data.selfLink FROM `PROJECT_ID.DATASET_ID.STRUCTURED_SUBNETWORK_TABLE_NAME` AS subnetwork WHERE ( subnetwork.resource.data.selfLink NOT IN ( SELECT DISTINCT networkInterface.subnetwork FROM `PROJECT_ID.DATASET_ID.STRUCTURED_INSTANCE_TABLE_NAME` as instance JOIN UNNEST(instance.resource.data.networkInterfaces) AS networkInterface WHERE networkInterface IS NOT NULL ) ) IS NULL;
Query to find vulnerable Dataproc clusters due to CVE-2021-44228
CREATE TEMP FUNCTION vulnerable_version(imageVersion STRING) RETURNS BOOL LANGUAGE js AS r""" const version_regexp = /(?<major>\d+)(?:\.)(?<minor>\d+)(?:\.)?(?<sub>\d+)?/g; let match = version_regexp.exec(imageVersion); if(match.groups.major < 1){ return true; } if (match.groups.major == 1){ if (match.groups.minor < 3){ return true; } if(match.groups.minor == 3 && match.groups.sub < 95){ return true; } if(match.groups.minor == 4 && match.groups.sub < 77){ return true; } if(match.groups.minor == 5 && match.groups.sub < 53){ return true; } } if (match.groups.major == 2 && match.groups.minor == 0 && match.groups.sub < 27){ return true; } return false; """; SELECT c.name, c.resource.data.projectId AS project_id, c.resource.data.clusterName AS cluster_name, c.resource.data.config.softwareConfig.imageVersion AS image_version, c.resource.data.status.state AS cluster_state, vulnerable_version(c.resource.data.config.softwareConfig.imageVersion) AS is_vulnerable FROM `PROJECT_ID.DATASET_ID.TABLE_NAME_PREFIX_dataproc_googleapis_com_Cluster` c INNER JOIN `PROJECT_ID.DATASET_ID.TABLE_NAME_PREFIX_cloudresourcemanager_googleapis_com_Project` p ON p.resource.data.projectId = c.resource.data.projectId WHERE c.resource.data.config.softwareConfig.imageVersion IS NOT NULL AND c.resource.data.status.state = "RUNNING" AND p.resource.data.lifecycleState = "ACTIVE";