The Storage Insights inventory report feature helps you manage your object storage at scale. It is a faster and scheduled alternative to the Objects: list API operation.
Inventory reports contain metadata about your objects, such as the object's storage class, ETag, and content type. This information helps you analyze your storage costs, audit and validate your objects, and ensure data security and compliance. You can export inventory reports as comma-separated value (CSV) or Apache Parquet files so you can further analyze them using tools such as BigQuery.
This page provides an overview of the Storage Insights inventory report feature. For instructions on how to use the feature, see Create and manage inventory reports.
Overview of inventory reports
Inventory reports contain a list of objects and their associated metadata for a given bucket, also known as the source bucket. To generate inventory reports, you must first create an inventory report configuration that defines how often the reports are generated, which metadata fields the reports include, and the bucket where the reports are generated and stored, also known as the destination bucket.
When you create an inventory report configuration, it is automatically assigned a universally unique identifier (UUID), which you cannot edit. You can, however, edit the following fields of an inventory report configuration:
- The display name of the inventory report configuration
- The object metadata fields that are included in the inventory reports
- The destination bucket that stores the inventory reports
- The schedule that determines how often inventory reports are generated
- The file format in which inventory reports are generated (CSV or Apache Parquet)
When you delete an inventory report configuration, new inventory reports are no longer generated for the configuration, but existing inventory reports remain.
Source and destination buckets
The source bucket contains the objects for which you want to generate inventory reports. It also contains the inventory report configuration. You can have up to 100 inventory report configurations in a source bucket.
The destination bucket stores the generated inventory reports. The destination bucket must meet the following requirements:
- It must be in the same location as the source bucket.
- It must be in the same project as the source bucket.
- It can be the same as the source bucket.
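The destination bucket requirements above can be sketched as a small pre-flight check. This is only an illustration: `BucketInfo` and `destination_problems` are hypothetical names, not part of any Cloud Storage client library, and the case-insensitive location comparison is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BucketInfo:
    """Hypothetical, minimal description of a bucket (not a Cloud Storage API type)."""
    name: str
    project: str
    location: str


def destination_problems(source: BucketInfo, destination: BucketInfo) -> list[str]:
    """Return the reasons why `destination` cannot store reports for `source`."""
    problems = []
    # Assumption: location identifiers are compared case-insensitively.
    if source.location.lower() != destination.location.lower():
        problems.append("destination must be in the same location as the source")
    if source.project != destination.project:
        problems.append("destination must be in the same project as the source")
    # The destination may be the source bucket itself, so names are not compared.
    return problems
```

An empty list means the pair satisfies the listed requirements; IAM permissions are a separate concern and are not checked here.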
When you first create an inventory report configuration, a service agent is automatically created on your behalf. To create inventory report configurations and write inventory reports to the destination bucket, both you and your service agent must have the required IAM permissions. See required permissions for yourself and for your service agent.
Storage Insights uses the source and destination buckets' names to determine which buckets to use when running jobs. If you delete a source or destination bucket and later create a new bucket with the same name, Storage Insights will run jobs using the new bucket.
Object metadata fields
The following metadata fields can be included in an inventory report. Metadata fields marked as "Required" must be included in the inventory report.
Metadata field | Description | Notes |
---|---|---|
project | The ID of the project where the source bucket resides. | Required |
bucket | The name of the source bucket. | Required |
name | The name of the object. | Required |
location | The location of the source bucket. | Optional |
size | The size of the object. | Optional |
timeCreated | The creation time of the object in RFC 3339 format. | Optional |
timeDeleted | The deletion time of the object in RFC 3339 format. Returned if and only if this version of the object is no longer a live version, but remains in the bucket as a noncurrent version. | Optional |
updated | The modification time of the object metadata in RFC 3339 format. | Optional |
storageClass | The storage class of the object. | Optional |
etag | HTTP 1.1 Entity tag for the object. | Optional |
retentionExpirationTime | The earliest time that the object can be deleted, which depends on any retention configuration set for the object and any retention policy set for the bucket that contains the object. The value for retentionExpirationTime is given in RFC 3339 format. | Optional |
crc32c | The CRC32C checksum, as described in RFC 4960 Appendix B, encoded using base64 in big-endian byte order. For more information about the CRC32C checksum, see Object metadata. | Optional |
md5Hash | The MD5 hash of the data, encoded using base64. This field is not present for composite objects. For more information about the MD5 hash, see Object metadata. | Optional |
generation | The content generation of this object. Used for object versioning. | Optional |
metageneration | The version of the metadata for this object at this generation. Used for preconditions and for detecting changes in metadata. A metageneration number is only meaningful in the context of a particular generation of a particular object. | Optional |
contentType | The Content-Type of the object data. If an object is stored without a Content-Type, it is served as application/octet-stream. | Optional |
contentEncoding | The Content-Encoding of the object data. | Optional |
timeStorageClassUpdated | The time at which the object's storage class was last changed. When the object is initially created, it is set to timeCreated. | Optional |
For more information about object metadata fields, see Object metadata.
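As a rough illustration of the "Required" rule in the table above, the following sketch checks a proposed list of metadata fields before it is placed in a configuration. The function name `check_metadata_fields` is hypothetical; the field names are copied from the table.

```python
# Field names taken from the table above.
REQUIRED_FIELDS = ("project", "bucket", "name")
OPTIONAL_FIELDS = (
    "location", "size", "timeCreated", "timeDeleted", "updated",
    "storageClass", "etag", "retentionExpirationTime", "crc32c", "md5Hash",
    "generation", "metageneration", "contentType", "contentEncoding",
    "timeStorageClassUpdated",
)


def check_metadata_fields(fields):
    """Return (missing_required, unknown) for a proposed list of field names."""
    chosen = set(fields)
    # Required fields that the caller forgot, in table order.
    missing = [f for f in REQUIRED_FIELDS if f not in chosen]
    # Names that do not appear in the table at all, sorted for stable output.
    known = set(REQUIRED_FIELDS) | set(OPTIONAL_FIELDS)
    unknown = sorted(chosen - known)
    return missing, unknown
```

Both returned lists are empty when the selection includes every required field and nothing outside the table.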
Inventory report shards
When an inventory report contains more than 1,000,000 objects, Storage Insights generates one or more shard objects to compose the inventory report. When all the shards of an inventory report have been successfully generated, a manifest file is generated in the same destination bucket as the shards.
Inventory report manifest file
The presence of a manifest file indicates that all the shards composing an inventory report have been generated. The manifest file also provides the names of the inventory report shard objects.
The manifest file follows the naming convention REPORT_CONFIG_UUID_TARGET_DATETIME_manifest.json, where:
- REPORT_CONFIG_UUID is the auto-generated UUID of the inventory report configuration.
- TARGET_DATETIME is the auto-generated UTC date and time at which an inventory report is generated.
An example manifest file name is fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:00_manifest.json.
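The naming convention can be taken apart with a short parser. This is a minimal sketch that assumes TARGET_DATETIME always has the `YYYY-MM-DDTHH:MM` shape seen in the example name; the source does not state the exact format, and `parse_manifest_name` is a hypothetical helper.

```python
import re
import uuid
from datetime import datetime, timezone

# A 36-character UUID, an underscore, the target datetime, then "_manifest.json".
_MANIFEST_NAME = re.compile(
    r"^(?P<uuid>[0-9a-fA-F-]{36})_(?P<target>.+)_manifest\.json$"
)


def parse_manifest_name(file_name):
    """Split a manifest file name into (report config UUID, UTC target datetime)."""
    match = _MANIFEST_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not a manifest file name: {file_name!r}")
    config_uuid = uuid.UUID(match.group("uuid"))
    # Assumption: the datetime uses the format shown in the example name.
    target = datetime.strptime(match.group("target"), "%Y-%m-%dT%H:%M")
    return config_uuid, target.replace(tzinfo=timezone.utc)
```

A name that does not follow the convention raises `ValueError`, which makes it easy to skip shard objects when scanning the destination bucket for manifests.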
The manifest.json file contains the following auto-populated fields:
    {
      "report_config": REPORT_CONFIG_FIELDS,
      "records_processed": NUMBER_OF_INCLUDED_OBJECTS,
      "snapshot_time": "SNAPSHOT_TIME",
      "target_datetime": "TARGET_DATETIME",
      "shard_count": SHARD_COUNT,
      "report_shards_file_names": [ SHARD_FILE_NAME ...]
    }
Where:
- REPORT_CONFIG_FIELDS includes fields contained within the ReportConfig resource.
- NUMBER_OF_INCLUDED_OBJECTS is the number of objects included in the inventory report.
- SNAPSHOT_TIME is the auto-generated UTC datetime at which the data snapshot occurs. All the data in an inventory report is captured at the snapshot.
- TARGET_DATETIME is the auto-generated UTC datetime at which an inventory report is generated.
- SHARD_COUNT is the total number of generated shards that compose the inventory report.
- SHARD_FILE_NAME is the name of a shard that composes an inventory report.
An example manifest.json file looks like the following:
    {
      "report_config": {
        "name": "projects/123456789098/locations/us/reportConfigs/fcec5187-afa6-48b0-938a-543d16493dc0",
        "createTime": "2023-06-08T08:07:53.397366139Z",
        "updateTime": "2023-06-08T08:07:53.552347723Z",
        "frequencyOptions": {
          "frequency": "DAILY",
          "startDate": { "year": 2023, "month": 6, "day": 9 },
          "endDate": { "year": 2023, "month": 6, "day": 23 }
        },
        "csvOptions": { "recordSeparator": "\n", "delimiter": "," },
        "objectMetadataReportOptions": {
          "metadataFields": [ "project", "bucket", "name", "location", "updated", "storageClass" ],
          "storageFilters": { "bucket": "my-test-bucket" },
          "storageDestinationOptions": { "bucket": "example-bucket", "destinationPath": "folder/subfolder" }
        }
      },
      "records_processed": 3993900,
      "snapshot_time": "2023-06-06T00:07:27Z",
      "target_datetime": { "year": 2023, "month": 6, "day": 6 },
      "shard_count": 4,
      "report_shards_file_names": [
        "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_0.csv",
        "fc95c52f-157a-494f-af4a-d4a53a69ba66_2022-11-30T00:54_1.csv",
        ...
      ]
    }
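A consumer of the manifest might read the shard names and sanity-check them against `shard_count` before processing a report. The sketch below works on the manifest's JSON text; `shard_names` is a hypothetical helper, and the expectation that `shard_count` equals the number of listed names is an assumption drawn from the field descriptions rather than a documented guarantee.

```python
import json


def shard_names(manifest_text):
    """Parse manifest.json text and return the list of shard object names."""
    manifest = json.loads(manifest_text)
    names = manifest["report_shards_file_names"]
    # Assumption: every generated shard is listed, so the counts should agree.
    if len(names) != manifest["shard_count"]:
        raise ValueError(
            f"manifest lists {len(names)} shards but shard_count is "
            f"{manifest['shard_count']}"
        )
    return names
```

Because the manifest is written only after all shards have been generated, fetching it first and then reading the shards it names avoids processing a partially written report.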
Pricing and supported bucket locations
Storage Insights charges for inventory reports per one million objects contained in a report. The rate depends on the storage location, as listed below. For more information about storage locations, see Bucket locations.
Supported locations
Location | Region | Pricing (per one million objects) |
---|---|---|
Asia | Taiwan (asia-east1) | $0.0025 |
Asia | Hong Kong (asia-east2) | $0.0028 |
Asia | Tokyo (asia-northeast1) | $0.0028 |
Asia | Osaka (asia-northeast2) | $0.0028 |
Asia | Seoul (asia-northeast3) | $0.0028 |
Asia | Singapore (asia-southeast1) | $0.0025 |
Australia | Sydney (australia-southeast1) | $0.0028 |
Australia | Melbourne (australia-southeast2) | $0.0028 |
Europe | Warsaw (europe-central2) | $0.0028 |
Europe | Finland (europe-north1) | $0.0025 |
Europe | Madrid (europe-southwest1) | $0.0028 |
Europe | Belgium (europe-west1) | $0.0025 |
Europe | London (europe-west2) | $0.0028 |
Europe | Frankfurt (europe-west3) | $0.0028 |
Europe | Netherlands (europe-west4) | $0.0025 |
Europe | Zurich (europe-west6) | $0.0031 |
Europe | Milan (europe-west8) | $0.0028 |
Europe | Paris (europe-west9) | $0.0028 |
India | Mumbai (asia-south1) | $0.0028 |
India | Delhi (asia-south2) | $0.0028 |
Indonesia | Jakarta (asia-southeast2) | $0.0028 |
Middle East | Tel Aviv (me-west1) | $0.0026 |
North America | Montreal (northamerica-northeast1) | $0.0028 |
North America | Toronto (northamerica-northeast2) | $0.0028 |
North America | Iowa (us-central1) | $0.0025 |
North America | South Carolina (us-east1) | $0.0025 |
North America | Northern Virginia (us-east4) | $0.0028 |
North America | Columbus (us-east5) | $0.0025 |
North America | Oregon (us-west1) | $0.0025 |
North America | Los Angeles (us-west2) | $0.0028 |
North America | Salt Lake City (us-west3) | $0.0028 |
North America | Las Vegas (us-west4) | $0.0028 |
North America | Dallas (us-south1) | $0.0025 |
South America | Sao Paulo (southamerica-east1) | $0.0043 |
South America | Santiago (southamerica-west1) | $0.0037 |
Multi-regions | Asia (asia) | $0.0028 |
Multi-regions | Europe (eu) | $0.0028 |
Multi-regions | United States (us) | $0.0028 |
Dual-regions | Tokyo/Osaka (asia1) | $0.0028 |
Dual-regions | Finland/Netherlands (eur4) | $0.0028 |
Dual-regions | Iowa/South Carolina (nam4) | $0.0028 |
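To get a feel for these rates, the per-report charge can be estimated with simple arithmetic. The sketch below assumes the charge scales linearly with the object count, with no rounding up to whole millions and no minimum charge; the page does not state how partial millions are billed, so treat the result as an estimate. `estimate_report_cost` is a hypothetical helper.

```python
from decimal import Decimal


def estimate_report_cost(object_count, price_per_million):
    """Estimate the USD charge for one inventory report.

    `price_per_million` is the rate from the table, passed as a string
    (for example "0.0028") so that no binary floating-point error creeps in.
    Assumption: partial millions are prorated linearly.
    """
    if object_count < 0:
        raise ValueError("object_count must not be negative")
    return Decimal(object_count) / Decimal(1_000_000) * Decimal(price_per_million)
```

Under this assumption, a report covering 3,993,900 objects at $0.0028 per one million objects comes to about $0.0112. A daily schedule multiplies that by the number of reports generated in the billing period.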
Audit logging
Cloud Storage creates audit logs when inventory reports are generated in the destination bucket. Storage Insights generates audit logs whenever inventory report configurations are created, updated, or deleted.
Cloud Storage does not create audit logs when an inventory report configuration reads object metadata from a source bucket.
Integration with VPC Service Controls
You can provide an additional layer of security for Storage Insights resources by using VPC Service Controls. When you use VPC Service Controls, you add projects to service perimeters that protect resources and services from requests that originate from outside of the perimeter. To learn more about VPC Service Controls and service perimeters, see Service perimeter details and configuration.
Limitation
Enabling IP filtering on a Cloud Storage bucket prevents Storage Insights from accessing that bucket, regardless of whether it uses a service agent to interact with Cloud Storage. To prevent service disruptions, we recommend not using IP filtering on buckets for which you create inventory reports.
What's next
Learn how to create an inventory report configuration and start generating inventory reports.