Storage Insights inventory reports

The Storage Insights inventory report feature helps you manage your object storage at scale. It is a faster, scheduled alternative to the Objects: list API operation.

Inventory reports contain metadata about your objects, such as each object's storage class, ETag, and content type. This information helps you analyze your storage costs, audit and validate your objects, and ensure data security and compliance. You can export inventory reports as comma-separated value (CSV) or Apache Parquet files so you can further analyze them using tools such as BigQuery.

This page provides an overview of the Storage Insights inventory report feature. For instructions on how to use the feature, see Create and manage inventory reports.

Overview of inventory reports

Inventory reports contain a list of objects and their associated metadata for a given bucket, also known as the source bucket. To generate inventory reports, you must first create an inventory report configuration that defines how often the reports are generated, the metadata fields you want the reports to include, and a bucket to generate and store the reports, also known as the destination bucket.

When you create an inventory report configuration, it is automatically assigned a universally unique identifier (UUID). The UUID cannot be edited. However, you can edit the following fields of an inventory report configuration:

  • The display name of the inventory report configuration
  • The object metadata fields that are included in the inventory reports
  • The destination bucket that stores the inventory reports
  • The schedule that determines how often inventory reports are generated
  • The file format in which inventory reports are generated (CSV or Apache Parquet)

When you delete an inventory report configuration, new inventory reports are no longer generated for the configuration, but existing inventory reports remain.

Source and destination buckets

The source bucket contains the objects for which you want to generate inventory reports. It also contains the inventory report configuration. You can have up to 100 inventory report configurations in a source bucket.

The destination bucket stores the generated inventory reports. The destination bucket must meet the following requirements:

  • It must be in the same location as the source bucket.
  • It must be in the same project as the source bucket.
  • It can be the same as the source bucket.
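
The requirements above can be sketched as a small pre-flight check. This is only an illustration: BucketInfo and destination_problems are hypothetical names rather than part of any client library, and the case-insensitive location comparison is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BucketInfo:
    # Hypothetical stand-in for bucket metadata; not a client-library type.
    name: str
    project: str
    location: str


def destination_problems(source: BucketInfo, destination: BucketInfo) -> List[str]:
    """Return the listed requirements that the destination bucket violates."""
    problems = []
    # Assumption: location strings are compared case-insensitively.
    if source.location.lower() != destination.location.lower():
        problems.append("destination must be in the same location as the source")
    if source.project != destination.project:
        problems.append("destination must be in the same project as the source")
    # No name check: the destination may be the source bucket itself.
    return problems


source = BucketInfo(name="my-test-bucket", project="my-project", location="US")
destination = BucketInfo(name="example-bucket", project="my-project", location="us")
print(destination_problems(source, destination))  # prints []
```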

When you first create an inventory report configuration, a service agent is automatically created on your behalf. To create inventory report configurations and write inventory reports to the destination bucket, both you and your service agent must have the required IAM permissions. See required permissions for yourself and for your service agent.

Storage Insights uses the source and destination buckets' names to determine which buckets to use when running jobs. If you delete a source or destination bucket and later create a new bucket with the same name, Storage Insights will run jobs using the new bucket.

Object metadata fields

The following metadata fields can be included in an inventory report. Metadata fields marked as "Required" must be included in the inventory report.

Metadata field | Description | Notes
project | The ID of the project where the source bucket resides. | Required
bucket | The name of the source bucket. | Required
name | The name of the object. | Required
location | The location of the source bucket. | Optional
size | The size of the object. | Optional
timeCreated | The creation time of the object in RFC 3339 format. | Optional
timeDeleted | The deletion time of the object in RFC 3339 format. Returned if and only if this version of the object is no longer a live version, but remains in the bucket as a noncurrent version. | Optional
updated | The modification time of the object metadata in RFC 3339 format. | Optional
storageClass | The storage class of the object. | Optional
etag | HTTP 1.1 Entity tag for the object. | Optional
retentionExpirationTime | The earliest time that the object can be deleted, which depends on any retention configuration set for the object and any retention policy set for the bucket that contains the object. The value for retentionExpirationTime is given in RFC 3339 format. | Optional
crc32c | The CRC32C checksum, as described in RFC 4960 Appendix B, encoded using base64 in big-endian byte order. For more information about the CRC32C checksum, see Object metadata. | Optional
md5Hash | The MD5 hash of the data, encoded using base64. This field is not present for composite objects. For more information about the MD5 hash, see Object metadata. | Optional
generation | The content generation of this object. Used for object versioning. | Optional
metageneration | The version of the metadata for this object at this generation. Used for preconditions and for detecting changes in metadata. A metageneration number is only meaningful in the context of a particular generation of a particular object. | Optional
contentType | The Content-Type of the object data. If an object is stored without a Content-Type, it is served as application/octet-stream. | Optional
contentEncoding | The Content-Encoding of the object data. | Optional
timeStorageClassUpdated | The time at which the object's storage class was last changed. When the object is initially created, it is set to timeCreated. | Optional

For more information about object metadata fields, see Object metadata.
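
As an illustration of the "Required" rule, the following sketch checks a list of metadata fields before it is used in a configuration. The field names come from the table above. The function name check_metadata_fields is hypothetical, and the service itself may validate configurations differently.

```python
from typing import List, Sequence, Tuple

# Field names copied from the metadata table on this page.
REQUIRED_FIELDS = ("project", "bucket", "name")
OPTIONAL_FIELDS = (
    "location", "size", "timeCreated", "timeDeleted", "updated",
    "storageClass", "etag", "retentionExpirationTime", "crc32c", "md5Hash",
    "generation", "metageneration", "contentType", "contentEncoding",
    "timeStorageClassUpdated",
)


def check_metadata_fields(fields: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Return (missing required fields, fields not listed in the table)."""
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    missing = [f for f in REQUIRED_FIELDS if f not in fields]
    unknown = [f for f in fields if f not in known]
    return missing, unknown


# A list that omits two required fields and misspells an optional one.
print(check_metadata_fields(["name", "storageclass"]))
# prints (['project', 'bucket'], ['storageclass'])
```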

Inventory report shards

When an inventory report contains more than 1,000,000 objects, Storage Insights generates one or more shard objects to compose the inventory report. When all the shards of an inventory report have been successfully generated, a manifest file is generated in the same destination bucket as the shards.

Inventory report manifest file

The presence of a manifest file indicates that all the shards composing an inventory report have been generated. The manifest file also provides the names of the inventory report shard objects.

The manifest file follows the naming convention REPORT_CONFIG_UUID_TARGET_DATETIME_manifest.json, where:

  • REPORT_CONFIG_UUID is the auto-generated UUID of the inventory report configuration.

  • TARGET_DATETIME is the auto-generated UTC date and time at which an inventory report is generated.

An example manifest file name is fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:00_manifest.json.
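
The naming convention can be parsed with a short helper such as the one below. It is a sketch under assumptions: the timestamp layout (YYYY-MM-DDTHH:MM) is inferred from the single example name above, and parse_manifest_name is a hypothetical function, not part of any library.

```python
import re
import uuid
from datetime import datetime, timezone
from typing import Optional, Tuple

# REPORT_CONFIG_UUID, an underscore, TARGET_DATETIME, then "_manifest.json".
_MANIFEST_RE = re.compile(
    r"^(?P<uuid>[0-9a-fA-F-]{36})_(?P<target>.+)_manifest\.json$"
)


def parse_manifest_name(object_name: str) -> Optional[Tuple[uuid.UUID, datetime]]:
    """Split a manifest object name into (config UUID, UTC target datetime).

    Returns None if the name does not follow the convention.
    """
    base = object_name.rsplit("/", 1)[-1]  # ignore any destination path prefix
    match = _MANIFEST_RE.match(base)
    if match is None:
        return None
    try:
        config_id = uuid.UUID(match.group("uuid"))
        # Assumption: the timestamp layout matches the example on this page.
        target = datetime.strptime(match.group("target"), "%Y-%m-%dT%H:%M")
    except ValueError:
        return None
    return config_id, target.replace(tzinfo=timezone.utc)


print(parse_manifest_name(
    "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:00_manifest.json"
))
```

Because the presence of a manifest signals that all shards exist, a helper like this could be used to find completed reports when listing the destination bucket.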

The manifest.json file contains the following auto-populated fields:

{
  "report_config": REPORT_CONFIG_FIELDS,
  "records_processed": NUMBER_OF_INCLUDED_OBJECTS,
  "snapshot_time": "SNAPSHOT_TIME",
  "target_datetime": "TARGET_DATETIME",
  "shard_count": SHARD_COUNT,
  "report_shards_file_names": [
    SHARD_FILE_NAME
    ...]
}

Where:

  • REPORT_CONFIG_FIELDS includes fields contained within the ReportConfig resource.

  • NUMBER_OF_INCLUDED_OBJECTS is the number of objects included in the inventory report.

  • SNAPSHOT_TIME is the auto-generated UTC datetime at which the data snapshot occurs. All the data in an inventory report is captured at the snapshot.

  • TARGET_DATETIME is the auto-generated UTC datetime at which an inventory report is generated.

  • SHARD_COUNT is the total number of generated shards that compose the inventory report.

  • SHARD_FILE_NAME is the name of a shard that composes an inventory report.

An example manifest.json file looks like the following:

{
  "report_config":
     {
       "name": "projects/123456789098/locations/us/reportConfigs/fcec5187-afa6-48b0-938a-543d16493dc0",
       "createTime": "2023-06-08T08:07:53.397366139Z",
       "updateTime": "2023-06-08T08:07:53.552347723Z",
       "frequencyOptions": {
         "frequency": "DAILY",
         "startDate": {
           "year": 2023,
           "month": 6,
           "day": 9
         },
         "endDate": {
           "year": 2023,
           "month": 6,
           "day": 23
         }
       },
       "csvOptions": {
         "recordSeparator": "\n",
         "delimiter": ","
       },
       "objectMetadataReportOptions": {
         "metadataFields": [
           "project",
           "bucket",
           "name",
           "location",
           "updated",
           "storageClass"
         ],
         "storageFilters": {
           "bucket": "my-test-bucket"
         },
         "storageDestinationOptions": {
           "bucket": "example-bucket",
           "destinationPath": "folder/subfolder"
         }
       }
     },
  "records_processed": 3993900,
  "snapshot_time": "2023-06-06T00:07:27Z",
  "target_datetime": {
    "year": 2023,
    "month": 6,
    "day": 6
  },
  "shard_count": 4,
  "report_shards_file_names": [
    "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_0.csv",
    "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_1.csv",
    ...
  ]
}
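
A consumer of the manifest might read the shard names as follows. This is a minimal sketch: SAMPLE_MANIFEST is a hypothetical, trimmed manifest that reuses the two shard names shown above, and the check that shard_count equals the number of listed names is an assumption based on the field descriptions, not documented behavior.

```python
import json
from typing import List


def shard_names(manifest_text: str) -> List[str]:
    """Return the shard object names listed in a manifest.json document."""
    manifest = json.loads(manifest_text)
    names = manifest["report_shards_file_names"]
    # Assumption: shard_count equals the number of listed shard names.
    if len(names) != manifest["shard_count"]:
        raise ValueError(
            f"manifest lists {len(names)} shards but shard_count is "
            f"{manifest['shard_count']}"
        )
    return names


# Hypothetical two-shard manifest; report_config is omitted for brevity.
SAMPLE_MANIFEST = """
{
  "records_processed": 1500000,
  "shard_count": 2,
  "report_shards_file_names": [
    "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_0.csv",
    "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_1.csv"
  ]
}
"""

for name in shard_names(SAMPLE_MANIFEST):
    print(name)
```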

Pricing and supported bucket locations

Storage Insights inventory reports are billed per one million objects contained in a report, at the rate listed below for the storage location. For more information about storage locations, see Bucket locations.

Supported locations

Location Region Pricing
Asia
Taiwan (asia-east1) $0.0025/one million objects
Hong Kong (asia-east2) $0.0028/one million objects
Tokyo (asia-northeast1) $0.0028/one million objects
Osaka (asia-northeast2) $0.0028/one million objects
Seoul (asia-northeast3) $0.0028/one million objects
Singapore (asia-southeast1) $0.0025/one million objects
Australia
Sydney (australia-southeast1) $0.0028/one million objects
Melbourne (australia-southeast2) $0.0028/one million objects
Europe
Warsaw (europe-central2) $0.0028/one million objects
Finland (europe-north1) $0.0025/one million objects
Madrid (europe-southwest1) $0.0028/one million objects
Belgium (europe-west1) $0.0025/one million objects
London (europe-west2) $0.0028/one million objects
Frankfurt (europe-west3) $0.0028/one million objects
Netherlands (europe-west4) $0.0025/one million objects
Zurich (europe-west6) $0.0031/one million objects
Milan (europe-west8) $0.0028/one million objects
Paris (europe-west9) $0.0028/one million objects
India
Mumbai (asia-south1) $0.0028/one million objects
Delhi (asia-south2) $0.0028/one million objects
Indonesia
Jakarta (asia-southeast2) $0.0028/one million objects
Middle East
Tel Aviv (me-west1) $0.0026/one million objects
North America
Montreal (northamerica-northeast1) $0.0028/one million objects
Toronto (northamerica-northeast2) $0.0028/one million objects
Iowa (us-central1) $0.0025/one million objects
South Carolina (us-east1) $0.0025/one million objects
Northern Virginia (us-east4) $0.0028/one million objects
Columbus (us-east5) $0.0025/one million objects
Oregon (us-west1) $0.0025/one million objects
Los Angeles (us-west2) $0.0028/one million objects
Salt Lake City (us-west3) $0.0028/one million objects
Las Vegas (us-west4) $0.0028/one million objects
Dallas (us-south1) $0.0025/one million objects
South America
Sao Paulo (southamerica-east1) $0.0043/one million objects
Santiago (southamerica-west1) $0.0037/one million objects
Multi-regions
Asia (asia) $0.0028/one million objects
Europe (eu) $0.0028/one million objects
United States (us) $0.0028/one million objects
Dual-regions
Tokyo/Osaka (asia1) $0.0028/one million objects
Finland/Netherlands (eur4) $0.0028/one million objects
Iowa/South Carolina (nam4) $0.0028/one million objects
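
As a worked example of the arithmetic, the sketch below estimates the charge for a single report. It assumes that the charge scales proportionally with the object count; whether partial millions are rounded is not stated on this page. estimated_report_cost is a hypothetical helper name.

```python
from decimal import Decimal


def estimated_report_cost(object_count: int, price_per_million: str) -> Decimal:
    """Estimate the charge in USD for one inventory report.

    Assumption: the charge is proportional to the object count, with no
    rounding to whole millions.
    """
    return Decimal(object_count) / Decimal(1_000_000) * Decimal(price_per_million)


# 3,993,900 objects at the United States (us) multi-region rate of $0.0028.
print(estimated_report_cost(3_993_900, "0.0028"))  # prints 0.01118292
```

Decimal is used instead of float so that the small per-million rates are not distorted by binary rounding.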

Audit logging

Cloud Storage creates audit logs when inventory reports are generated in the destination bucket. Storage Insights generates audit logs whenever inventory report configurations are created, updated, or deleted.

Cloud Storage does not create audit logs when an inventory report configuration reads object metadata from a source bucket.

Integration with VPC Service Controls

You can provide an additional layer of security for Storage Insights resources by using VPC Service Controls. When you use VPC Service Controls, you add projects to service perimeters that protect resources and services from requests that originate from outside of the perimeter. To learn more about VPC Service Controls and service perimeters, see Service perimeter details and configuration.

Limitation

Enabling IP filtering on a Cloud Storage bucket prevents Storage Insights from accessing the bucket, regardless of whether it uses a service agent to interact with Cloud Storage. To prevent service disruptions, we recommend not using IP filtering on a Cloud Storage bucket if you are creating inventory reports for that bucket.

What's next

Learn how to create an inventory report configuration and start generating inventory reports.