Monitor listings
As a data provider, you can track the usage metrics for you listings. There are two methods to get the usage metrics for your shared data:
Use Analytics Hub. With Analytics Hub you can view the usage metrics dashboard for your listings that includes daily subscriptions, daily executed jobs, number of subscribers for each organization, and jobs frequency for each table. The usage metrics for your shared data is retrieved by querying the
INFORMATION_SCHEMA.SHARED_DATASET_USAGE
view.Use the
INFORMATION_SCHEMA
view. You can track how subscribers use your datasets by directly querying theINFORMATION_SCHEMA.SHARED_DATASET_USAGE
view.
Use Analytics Hub
To get the usage metrics for your shared data by using Analytics Hub, follow these steps:
In the Google Cloud console, go to the Analytics Hub page.
Click the data exchange name that contains the listing for which you want to view the usage metrics.
Click Usage metrics, and then do the following:
From the Listings menu, select the listing.
Set the time range.
The page displays the following usage metrics:
- Total Subscriptions: the number of current subscriptions on the selected listing. You can view total subscriptions for up to 60 days.
- Total Subscribers: the number of unique subscribers across all subscriptions on the selected listing. You can view total subscribers for up to 60 days.
- Total jobs executed: the number of unique jobs run on each table of the selected listing.
- Total bytes scanned: the total number of bytes scanned from all tables of the selected listing.
- Daily Subscriptions: the chart that tracks the number of subscriptions for the selected listing over a time period. You can view daily subscriptions for up to 60 days.
- Subscribers per organization: lists the organizations and their number of subscribers that consume your selected listing.
- Daily Executed Jobs: this chart displays the jobs consumption from the selected listing.
- Tables' job frequency: the frequency at which the tables are accessed on the selected listing.
Use INFORMATION_SCHEMA
view
If you are a data provider, you can track how subscribers use your datasets by
querying the INFORMATION_SCHEMA.SHARED_DATASET_USAGE
view.
Ensure that you have the required role to query this view.
To run the query against a project other than your default project, add the project ID in the following format:
PROJECT_ID
.region-REGION_NAME
.INFORMATION_SCHEMA.SHARED_DATASET_USAGE
For example, myproject.region-us.INFORMATION_SCHEMA.SHARED_DATASET_USAGE
.
The following examples describe how to view the usage metrics by querying the
INFORMATION_SCHEMA
view:
Get the total number of jobs executed on all shared tables
The following example calculates total jobs run by subscribers for a project:
SELECT COUNT(DISTINCT job_id) AS num_jobs FROM `region-us`.INFORMATION_SCHEMA.SHARED_DATASET_USAGE
The result is similar to the following:
+------------+ | num_jobs | +------------+ | 1000 | +------------+
To check the total jobs run by subscribers, use the WHERE
clause:
- For datasets, use
WHERE dataset_id = "..."
. - For tables, use
WHERE dataset_id = "..." AND table_id = "..."
.
Get the most used table based on the number of rows processed
The following query calculates the most used table based on the number of rows processed by subscribers.
SELECT dataset_id, table_id, SUM(num_rows_processed) AS usage_rows FROM `region-us`.INFORMATION_SCHEMA.SHARED_DATASET_USAGE GROUP BY 1, 2 ORDER BY 3 DESC LIMIT 1
The output is similar to the following:
+---------------+-------------+----------------+ | dataset_id | table_id | usage_rows | +---------------+-------------+----------------+ | mydataset | mytable | 15 | +---------------+-------------+----------------+
Find the top organizations that consume your tables
The following query calculates the top subscribers based on the number of bytes
processed from your tables. You can also use the num_rows_processed
column as
a metric.
SELECT subscriber_org_number, ANY_VALUE(subscriber_org_display_name) AS subscriber_org_display_name, SUM(total_bytes_processed) AS usage_bytes FROM `region-us`.INFORMATION_SCHEMA.SHARED_DATASET_USAGE GROUP BY 1
The output is similar to the following:
+--------------------------+--------------------------------+----------------+ |subscriber_org_number | subscriber_org_display_name | usage_bytes | +-----------------------------------------------------------+----------------+ | 12345 | myorganization | 15 | +--------------------------+--------------------------------+----------------+
For subscribers without an organization, you can use job_project_number
instead of subscriber_org_number
.
Get usage metrics for your data exchange
If your data exchange and source dataset are in different projects, follow these step to view the usage metrics for your data exchange:
- Find all listings that belong to your data exchange.
- Retrieve the source dataset attached to the listing.
- To view the usage metrics for your data exchange, use the following query:
SELECT * FROM source_project_1.`region-us`.INFORMATION_SCHEMA.SHARED_DATASET_USAGE WHERE dataset_id='source_dataset_id' AND data_exchange_id="projects/4/locations/us/dataExchanges/x1" UNION ALL SELECT * FROM source_project_2.`region-us`.INFORMATION_SCHEMA.SHARED_DATASET_USAGE WHERE dataset_id='source_dataset_id' AND data_exchange_id="projects/4/locations/us/dataExchanges/x1"
What's next
- Learn how to manage Analytics Hub listings.
- Learn about BigQuery pricing.