Inventory reports

Inventory reports help you manage your object storage at scale. They are a faster, scheduled alternative to the Objects: list API operation. Use inventory reports if you want to validate the migration of large buckets without any performance impact or check the integrity of objects in a single bucket.

Inventory reports contain metadata about your objects, such as each object's storage class, ETag, and content type. This information helps you analyze your storage costs, audit and validate your objects, and ensure data security and compliance. You can export inventory reports as comma-separated value (CSV) or Apache Parquet files so that you can analyze them further using tools such as BigQuery.

This page provides an overview of inventory reports. For instructions on how to use inventory reports, see Create and manage inventory reports.

Overview of inventory reports

Inventory reports contain a list of objects and their associated metadata for a given bucket, also known as the source bucket. To generate inventory reports, you must first create an inventory report configuration that defines how often the reports are generated, the metadata fields you want the reports to include, and a bucket to generate and store the reports, also known as the destination bucket.

When you create an inventory report configuration, it automatically gets assigned a universally unique identifier (UUID). This field is non-editable. However, you can edit the following fields of an inventory report configuration:

The display name of the inventory report configuration
The object metadata fields that are included in the inventory reports
The destination bucket that stores the inventory reports
The schedule that determines how often inventory reports are generated
The file format in which inventory reports are generated (CSV or Apache Parquet)

When you delete an inventory report configuration, new inventory reports are no longer generated for the configuration, but existing inventory reports remain.

When to use inventory reports

Inventory reports are designed for quick analysis of individual buckets. You can use them to do the following:

List all objects within a bucket
Validate the success of data transfers
Generate audit reports for a particular bucket
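
As a rough illustration of validating a data transfer, the sketch below compares rows from two inventory reports, one for the original bucket and one for the bucket the data was copied to, and lists objects that are missing or whose checksum differs. It assumes both reports include the name and crc32c metadata fields. The column order in COLUMNS, the helper names, and the sample rows (including the placeholder checksum strings) are hypothetical and are not taken from a real report.

```python
import csv
import io


def index_by_name(rows):
    """Map object name -> crc32c value for an iterable of row dicts."""
    return {row["name"]: row["crc32c"] for row in rows}


def diff_inventories(source_rows, dest_rows):
    """Return (missing, mismatched) object names, each sorted."""
    source = index_by_name(source_rows)
    dest = index_by_name(dest_rows)
    missing = sorted(name for name in source if name not in dest)
    mismatched = sorted(
        name for name in source if name in dest and source[name] != dest[name]
    )
    return missing, mismatched


# Hypothetical column order; a real report's layout follows its configuration.
COLUMNS = ["project", "bucket", "name", "crc32c"]

# Made-up rows with placeholder checksum strings, for illustration only.
source_csv = "p,src,a.txt,sum-a\np,src,b.txt,sum-b\np,src,c.txt,sum-c\n"
dest_csv = "p,dst,a.txt,sum-a\np,dst,b.txt,sum-x\n"

source_rows = csv.DictReader(io.StringIO(source_csv), fieldnames=COLUMNS)
dest_rows = csv.DictReader(io.StringIO(dest_csv), fieldnames=COLUMNS)

missing, mismatched = diff_inventories(source_rows, dest_rows)
print("missing:", missing)
print("mismatched:", mismatched)
```

The same comparison could use md5Hash instead of crc32c, with the caveat that md5Hash is not present for composite objects.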

When not to use inventory reports

Manually collating and analyzing inventory reports from many buckets and projects can be challenging, especially for large-scale analysis. For use cases such as organization-wide visibility, security analysis, or cost management, you can use Storage Insights datasets. Storage Insights datasets let you configure a custom scope at the bucket, folder, project, or organization level. They also provide additional insights, such as custom metadata and soft delete information.

For Storage Insights datasets, data is refreshed daily, and you can analyze it using SQL in BigQuery or natural language questions with Gemini.

Use Storage Insights datasets if your goals are as follows:

Cross-organizational data discovery
Analysis for cost optimization and lifecycle management
Governance and security auditing
Time-series analysis to identify trends

Storage Insights datasets are available only through the Storage Intelligence subscription.

Source and destination buckets

The source bucket contains the objects for which you want to generate inventory reports. It also contains the inventory report configuration. You can have up to 100 inventory report configurations in a source bucket.

The destination bucket stores the generated inventory reports. The destination bucket must meet the following requirements:

It must be in the same location as the source bucket.
It must be in the same project as the source bucket.
It can be the same as the source bucket.
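
The requirements above can be expressed as a small pre-flight check. This is a minimal sketch, not part of any client library: the BucketInfo type and the destination_problems function are hypothetical, and comparing locations case-insensitively is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketInfo:
    """Hypothetical summary of the bucket attributes the check needs."""
    name: str
    project: str
    location: str


def destination_problems(source, destination):
    """Return a list of reasons the destination bucket is unsuitable."""
    problems = []
    # Assumption: location strings are compared case-insensitively.
    if source.location.lower() != destination.location.lower():
        problems.append("destination is in a different location")
    if source.project != destination.project:
        problems.append("destination is in a different project")
    # The destination may be the same bucket as the source, so names are not compared.
    return problems


source = BucketInfo(name="my-test-bucket", project="my-project", location="us")
destination = BucketInfo(name="example-bucket", project="my-project", location="US")
print(destination_problems(source, destination))
```

An empty list means the destination satisfies both requirements.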

When you first create an inventory report configuration, a service agent is automatically created on your behalf. To create inventory report configurations and write inventory reports to the destination bucket, both you and the service agent must have the required IAM permissions. See required permissions for yourself and for your service agent.

Inventory reports use the source and destination buckets' names to determine which buckets to use when running jobs. If you delete a source or destination bucket and later create a new bucket with the same name, inventory reports will run jobs using the new bucket.

Object metadata fields

The following metadata fields can be included in an inventory report. Metadata fields marked as "Required" must be included in the inventory report.

Metadata field	Description	Notes
project	The ID of the project where the source bucket resides.	Required
bucket	The name of the source bucket.	Required
name	The name of the object.	Required
location	The location of the source bucket.	Optional
size	The size of the object.	Optional
timeCreated	The creation time of the object in RFC 3339 format.	Optional
timeDeleted	The deletion time of the object in RFC 3339 format. Returned if and only if this version of the object is no longer a live version, but remains in the bucket as a noncurrent version.	Optional
updated	The modification time of the object metadata in RFC 3339 format.	Optional
storageClass	The storage class of the object.	Optional
etag	HTTP 1.1 Entity tag for the object.	Optional
retentionExpirationTime	The earliest time that the object can be deleted, which depends on any retention configuration set for the object and any retention policy set for the bucket that contains the object. The value for `retentionExpirationTime` is given in RFC 3339 format.	Optional
crc32c	The CRC32C checksum, as described in RFC 4960 Appendix B, encoded using base64 in big-endian byte order. For more information about the CRC32C checksum, see Object metadata.	Optional
md5Hash	The MD5 hash of the data, encoded using base64. This field is not present for composite objects. For more information about the MD5 hash, see Object metadata.	Optional
generation	The content generation of this object. Used for object versioning.	Optional
metageneration	The version of the metadata for this object at this generation. Used for preconditions and for detecting changes in metadata. A metageneration number is only meaningful in the context of a particular generation of a particular object.	Optional
contentType	The Content-Type of the object data. If an object is stored without a Content-Type, it is served as application/octet-stream.	Optional
contentEncoding	The Content-Encoding of the object data.	Optional
timeStorageClassUpdated	The time at which the object's storage class was last changed. When the object is initially created, it is set to timeCreated.	Optional

For more information about object metadata fields, see Object metadata.
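
Before creating a configuration, it can be useful to check a planned list of metadata fields against the table above. The following sketch does that with plain Python. The field names come from the table, while the validate_metadata_fields helper is hypothetical and not part of any API.

```python
# Field names copied from the table above.
REQUIRED_FIELDS = ("project", "bucket", "name")
OPTIONAL_FIELDS = frozenset({
    "location", "size", "timeCreated", "timeDeleted", "updated",
    "storageClass", "etag", "retentionExpirationTime", "crc32c",
    "md5Hash", "generation", "metageneration", "contentType",
    "contentEncoding", "timeStorageClassUpdated",
})


def validate_metadata_fields(fields):
    """Return (missing_required, unknown) for a list of field names."""
    missing = [f for f in REQUIRED_FIELDS if f not in fields]
    unknown = [
        f for f in fields
        if f not in REQUIRED_FIELDS and f not in OPTIONAL_FIELDS
    ]
    return missing, unknown


planned = ["project", "bucket", "name", "location", "updated", "storageClass"]
print(validate_metadata_fields(planned))
```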

Inventory report shards

When an inventory report contains more than 1,000,000 objects, one or more shard objects are generated to compose the inventory report. When all the shards of an inventory report have been successfully generated, a manifest file is generated in the same destination bucket as the shards.

Inventory report manifest file

The presence of a manifest file indicates that all the shards composing an inventory report have been generated. The manifest file also provides the names of the inventory report shard objects.

The manifest file follows the naming convention REPORT_CONFIG_UUID_TARGET_DATETIME_manifest.json, where:

REPORT_CONFIG_UUID is the auto-generated UUID of the inventory report configuration.
TARGET_DATETIME is the auto-generated UTC date and time at which an inventory report is generated.

An example manifest filename is fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:00_manifest.json.
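
To pick manifest files out of a listing of the destination bucket, the naming convention can be matched with a regular expression. The sketch below is inferred from the single example filename above, so the exact timestamp layout (YYYY-MM-DDTHH:MM) and lowercase hexadecimal UUID are assumptions. The parse_manifest_name helper is hypothetical.

```python
import re
from datetime import datetime, timezone

# Pattern assumed from the naming convention and the example filename.
MANIFEST_RE = re.compile(
    r"^(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"_(?P<target>\d{4}-\d{2}-\d{2}T\d{2}:\d{2})_manifest\.json$"
)


def parse_manifest_name(filename):
    """Return (config_uuid, target_datetime), or None if the name does not match."""
    match = MANIFEST_RE.match(filename)
    if match is None:
        return None
    # TARGET_DATETIME is documented as UTC, so attach the UTC timezone.
    target = datetime.strptime(match["target"], "%Y-%m-%dT%H:%M")
    return match["uuid"], target.replace(tzinfo=timezone.utc)


example = "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:00_manifest.json"
print(parse_manifest_name(example))
```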

The manifest.json file contains the following auto-populated fields:

{
  "report_config": REPORT_CONFIG_FIELDS,
  "records_processed": NUMBER_OF_INCLUDED_OBJECTS,
  "snapshot_time": "SNAPSHOT_TIME",
  "target_datetime": "TARGET_DATETIME",
  "shard_count": SHARD_COUNT,
  "report_shards_file_names": [
    SHARD_FILE_NAME
    ...]
}

Where:

REPORT_CONFIG_FIELDS includes fields contained within the ReportConfig resource.
NUMBER_OF_INCLUDED_OBJECTS is the number of objects included in the inventory report.
SNAPSHOT_TIME is the auto-generated UTC datetime at which the data snapshot occurs. All the data in an inventory report is captured at the snapshot.
TARGET_DATETIME is the auto-generated UTC datetime at which an inventory report is generated.
SHARD_COUNT is the total number of generated shards that compose the inventory report.
SHARD_FILE_NAME is the name of a shard that composes an inventory report.

An example manifest.json file looks like the following:

{
  "report_config":
     {
       "name": "projects/123456789098/locations/us/reportConfigs/fcec5187-afa6-48b0-938a-543d16493dc0",
       "createTime": "2023-06-08T08:07:53.397366139Z",
       "updateTime": "2023-06-08T08:07:53.552347723Z",
       "frequencyOptions": {
         "frequency": "DAILY",
         "startDate": {
           "year": 2023,
           "month": 6,
           "day": 9
         },
         "endDate": {
           "year": 2023,
           "month": 6,
           "day": 23
         }
       },
       "csvOptions": {
         "recordSeparator": "\n",
         "delimiter": ","
       },
       "objectMetadataReportOptions": {
         "metadataFields": [
           "project",
           "bucket",
           "name",
           "location",
           "updated",
           "storageClass"
         ],
         "storageFilters": {
           "bucket": "my-test-bucket"
         },
         "storageDestinationOptions": {
           "bucket": "example-bucket",
           "destinationPath": "folder/subfolder"
         }
       }
     },
  "records_processed": 3993900,
  "snapshot_time" : "2023-06-06T00:07:27Z",
  "target_datetime": {
    "year": 2023,
    "month": 6,
    "day": 6
  },
  "shard_count": 4,
  "report_shards_file_names": [
    "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_0.csv",
    "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_1.csv",
    ...
  ]
}
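
Once a manifest has been downloaded, a consumer can read the shard names from it before processing the report. The following is a minimal sketch that uses only the standard library. It assumes that shard_count equals the number of entries in report_shards_file_names, which the field descriptions suggest but do not state outright. The shard_names helper and the abbreviated sample manifest are illustrative, not real output.

```python
import json


def shard_names(manifest_text):
    """Parse a manifest and return its shard file names.

    Raises ValueError if the declared shard count does not match the list.
    """
    manifest = json.loads(manifest_text)
    names = manifest["report_shards_file_names"]
    declared = manifest["shard_count"]
    if len(names) != declared:
        raise ValueError(
            f"manifest declares {declared} shards but lists {len(names)}"
        )
    return names


# Abbreviated, made-up manifest containing only the fields this sketch reads.
sample_manifest = json.dumps({
    "records_processed": 1500000,
    "shard_count": 2,
    "report_shards_file_names": [
        "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_0.csv",
        "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_1.csv",
    ],
})

for name in shard_names(sample_manifest):
    print(name)
```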

Pricing

Using inventory reports incurs charges that depend on the bucket's location. For more information about inventory reports pricing, see Pricing.

Audit logging

Cloud Storage creates audit logs when inventory reports are generated in the destination bucket and whenever inventory report configurations are created, updated, or deleted.

Cloud Storage does not create audit logs when an inventory report configuration reads object metadata from a source bucket.

Integration with VPC Service Controls

You can provide an additional layer of security for inventory reports resources by using VPC Service Controls. When you use VPC Service Controls, you add projects to service perimeters that protect resources and services from requests that originate from outside of the perimeter. To learn more about VPC Service Controls and service perimeters, see Service perimeter details and configuration.

Limitation

Enabling IP filtering on Cloud Storage buckets restricts inventory reports from being able to access the bucket, regardless of whether they use a service agent to interact with Cloud Storage. To prevent service disruptions, we recommend not using IP filtering on Cloud Storage buckets if you are creating inventory reports for that bucket.

What's next

Learn how to create an inventory report configuration and start generating inventory reports.