Use soft delete


This page describes how to enable, disable, update, and check the status of the soft delete policy on a bucket. To learn how to list and restore soft-deleted objects, see Use soft-deleted objects.

Required roles

To get the permissions that you need to create and manage soft delete policies, ask your administrator to grant you the Storage Admin (roles/storage.admin) IAM role on the bucket or the project that contains the bucket.

This predefined role contains the permissions required to create and manage soft delete policies. The exact permissions that are required are listed in the following section:

Required permissions

The following permissions are required to create and manage soft delete policies:

  • storage.buckets.get
  • storage.buckets.update
  • storage.buckets.list (this permission is only required if you plan to use the Google Cloud console to perform the instructions on this page)

For information about granting roles, see Use IAM with buckets or Manage access to projects.

Edit a bucket's soft delete policy

To enable or change the soft delete policy for a bucket:

Console

  1. In the Google Cloud console, go to the Cloud Storage Buckets page.

    Go to Buckets

  2. In the list of buckets, click the name of the bucket whose soft delete policy you want to manage.

  3. Click the Protection tab.

  4. In the Soft delete policy section, perform one of the following actions:

    • If the bucket doesn't have a soft delete policy, click Edit, choose a unit of time and a length of time for your retention duration, and click Save.

    • If the bucket has a soft delete policy, click Edit to change the unit of time and length of time for your retention duration.

To learn how to get detailed error information about failed Cloud Storage operations in the Google Cloud console, see Troubleshooting.

Command line

To add or modify the soft delete policy on a bucket, use the gcloud storage buckets update command with the --soft-delete-duration flag:

  gcloud storage buckets update gs://BUCKET_NAME --soft-delete-duration=SOFT_DELETE_DURATION

Where:

  • BUCKET_NAME is the name of the bucket. For example, my-bucket.
  • SOFT_DELETE_DURATION specifies the duration to retain soft-deleted objects. For example, 2w1d is two weeks and one day. For more information, see soft delete retention duration.

REST APIs

JSON API

  1. Have the gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.

    Alternatively, you can create an access token using the OAuth 2.0 Playground and include it in the Authorization header.

  2. Create a JSON file that contains the following information:

    {
      "softDeletePolicy": {
        "retentionDurationSeconds": "TIME_IN_SECONDS"
      }
    }

    Where TIME_IN_SECONDS is the amount of time in seconds that you want to retain soft-deleted objects. For example, 2678400 (31 days). For more information, see soft delete retention duration.

  3. Use cURL to call the JSON API with a PATCH Bucket request:

    curl -X PATCH --data-binary @JSON_FILE_NAME \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      "https://storage.googleapis.com/storage/v1/b/BUCKET_NAME"

    Where:

    • JSON_FILE_NAME is the path for the JSON file that you created in Step 2.
    • BUCKET_NAME is the name of the relevant bucket. For example, my-bucket.
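
If you prefer to make the same change from Python, the following is a minimal sketch that uses the google-cloud-storage client library. It assumes a recent library version (2.16 or later) in which Bucket exposes a soft_delete_policy property; the helper name set_soft_delete_policy and the bucket my-bucket are placeholders for illustration.

  from google.cloud import storage

  def set_soft_delete_policy(bucket_name: str, retention_seconds: int) -> None:
      """Sets or updates the soft delete retention duration on a bucket."""
      client = storage.Client()
      bucket = client.get_bucket(bucket_name)
      # The duration is expressed in seconds, matching the JSON request above.
      bucket.soft_delete_policy.retention_duration_seconds = retention_seconds
      bucket.patch()

  # For example, 2678400 seconds is 31 days (31 * 24 * 60 * 60).
  set_soft_delete_policy("my-bucket", 2678400)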

Delete a bucket's soft delete policy

Console

  1. In the Google Cloud console, go to the Cloud Storage Buckets page.

    Go to Buckets

  2. In the list of buckets, click the name of the bucket whose soft delete policy you want to delete.

  3. Click the Protection tab.

  4. In the Soft delete policy section, click Disable to remove the soft delete policy for the bucket.

  5. Click Confirm.

Alternatively, you can disable the policy by setting its retention duration to 0 days. To disable soft delete policies on multiple buckets in bulk, see Disable soft delete on buckets with the most soft-deleted bytes.

To learn how to get detailed error information about failed Cloud Storage operations in the Google Cloud console, see Troubleshooting.

Command line

To remove the soft delete policy from a bucket, use the gcloud storage buckets update command with the --clear-soft-delete flag:

  gcloud storage buckets update gs://BUCKET_NAME --clear-soft-delete

Where:

  • BUCKET_NAME is the name of the bucket. For example, my-bucket.

REST APIs

JSON API

  1. Have the gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.

    Alternatively, you can create an access token using the OAuth 2.0 Playground and include it in the Authorization header.

  2. Create a JSON file that contains the following information:

    {
      "softDeletePolicy": {
        "retentionDurationSeconds": "TIME_IN_SECONDS"
      }
    }

    To disable the soft delete policy for a bucket, use the value 0 for TIME_IN_SECONDS.

  3. Use cURL to call the JSON API with a PATCH Bucket request:

    curl -X PATCH --data-binary @JSON_FILE_NAME \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      "https://storage.googleapis.com/storage/v1/b/BUCKET_NAME"

    Where:

    • JSON_FILE_NAME is the path for the JSON file that you created in Step 2.
    • BUCKET_NAME is the name of the relevant bucket. For example, my-bucket.
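
As with editing the policy, you can make the same call from Python. The sketch below again assumes the google-cloud-storage library's soft_delete_policy property; setting the retention duration to 0 has the same effect as the JSON request above, and the disable_soft_delete helper name is illustrative only.

  from google.cloud import storage

  def disable_soft_delete(bucket_name: str) -> None:
      """Disables soft delete by setting the retention duration to 0 seconds."""
      client = storage.Client()
      bucket = client.get_bucket(bucket_name)
      bucket.soft_delete_policy.retention_duration_seconds = 0
      bucket.patch()

  disable_soft_delete("my-bucket")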

Check if the soft delete policy is enabled on a bucket

Console

  1. In the Google Cloud console, go to the Cloud Storage Buckets page.

    Go to Buckets

  2. In the list of buckets, click the name of the bucket whose soft delete policy you want to check.

  3. Click the Protection tab.

    The status displays in the Soft delete policy (for data recovery) section.


To learn how to get detailed error information about failed Cloud Storage operations in the Google Cloud console, see Troubleshooting.

Command line

To check the soft delete policy status of a bucket, use the gcloud storage buckets describe command:

  gcloud storage buckets describe gs://BUCKET_NAME \
      --format="default(soft_delete_policy)"

Where:

  • BUCKET_NAME is the name of the bucket. For example, my-bucket.

REST APIs

JSON API

  1. Have the gcloud CLI installed and initialized, which lets you generate an access token for the Authorization header.

    Alternatively, you can create an access token using the OAuth 2.0 Playground and include it in the Authorization header.

  2. Use cURL to call the JSON API with a GET Bucket request:

    curl -X GET \
      -H "Authorization: Bearer $(gcloud auth print-access-token)" \
      -H "Content-Type: application/json" \
      "https://storage.googleapis.com/storage/v1/b/BUCKET_NAME?fields=softDeletePolicy"

    Where BUCKET_NAME is the name of the relevant bucket. For example, my-bucket.
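
To run the same check from Python, a minimal sketch might look like the following, again assuming the google-cloud-storage library exposes the bucket's soft_delete_policy; print_soft_delete_policy is an illustrative helper. A retention duration of 0 means soft delete is disabled for the bucket.

  from google.cloud import storage

  def print_soft_delete_policy(bucket_name: str) -> None:
      """Prints a bucket's soft delete retention settings."""
      client = storage.Client()
      bucket = client.get_bucket(bucket_name)
      policy = bucket.soft_delete_policy
      print(f"Retention duration (seconds): {policy.retention_duration_seconds}")
      print(f"Effective time: {policy.effective_time}")

  print_soft_delete_policy("my-bucket")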

Disable soft delete on buckets with the most soft-deleted bytes

In the Google Cloud console, you can review soft delete-enabled buckets in your project that have the highest number of soft-deleted bytes in the last 30 days and bulk disable soft delete to help lower costs if needed.

Console

  1. In the Google Cloud console, go to the Cloud Storage Buckets page.

    Go to Buckets

  2. On the Cloud Storage page, click Settings.

  3. Click the Soft delete tab.

  4. From the Top buckets by deleted bytes list, select the buckets you want to disable soft delete for.

  5. Click Turn off soft delete.

    Cloud Storage turns off soft delete on the buckets you selected and removes them from the list. Note that you can disable soft delete on at most 100 buckets at a time in the Google Cloud console. To disable soft delete on more than 100 buckets at a time, use the command-line or Python client library instructions instead.

Command line

To disable soft delete on every bucket in your project, use the gcloud storage buckets update command with the --clear-soft-delete flag and a wildcard:

  gcloud storage buckets update --clear-soft-delete gs://*

Client libraries

Python

For more information, see the Cloud Storage Python API reference documentation.

To authenticate to Cloud Storage, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.


import argparse
import json
from typing import Dict, List
import google.cloud.monitoring_v3 as monitoring_client


def get_relative_cost(storage_class: str) -> float:
    """Retrieves the relative cost for a given storage class and location.

    Args:
        storage_class: The storage class (e.g., 'STANDARD', 'NEARLINE').

    Returns:
        The price per GB for the given storage class (from
        https://cloud.google.com/storage/pricing), divided by the price per GB
        of Standard storage.
    """
    relative_cost = {
        "STANDARD": 0.023 / 0.023,
        "NEARLINE": 0.013 / 0.023,
        "COLDLINE": 0.007 / 0.023,
        "ARCHIVE": 0.0025 / 0.023,
    }

    return relative_cost.get(storage_class, 1.0)


def get_soft_delete_cost(
    project_name: str,
    soft_delete_window: float,
    agg_days: int,
    lookback_days: int,
) -> Dict[str, List[Dict[str, float]]]:
    """Calculates soft delete costs for buckets in a Google Cloud project.

    Args:
        project_name: The name of the Google Cloud project.
        soft_delete_window: The time window in seconds for considering
          soft-deleted objects (default is 7 days).
        agg_days: Aggregate results over this time period in days, defaults to
          a 30-day period.
        lookback_days: Look back up to this many days, defaults to 360 days.

    Returns:
        A dictionary with bucket names as keys and cost data for each bucket,
        broken down by storage class.
    """

    query_client = monitoring_client.QueryServiceClient()

    # Step 1: Get storage class ratios for each bucket.
    storage_ratios_by_bucket = get_storage_class_ratio(
        project_name, query_client, agg_days, lookback_days
    )

    # Step 2: Fetch soft-deleted bytes and calculate costs using Monitoring API.
    soft_deleted_costs = calculate_soft_delete_costs(
        project_name,
        query_client,
        soft_delete_window,
        storage_ratios_by_bucket,
        agg_days,
        lookback_days,
    )

    return soft_deleted_costs


def calculate_soft_delete_costs(
    project_name: str,
    query_client: monitoring_client.QueryServiceClient,
    soft_delete_window: float,
    storage_ratios_by_bucket: Dict[str, float],
    agg_days: int,
    lookback_days: int,
) -> Dict[str, List[Dict[str, float]]]:
    """Calculates the relative cost of enabling soft delete for each bucket in a
       project over a given time frame in seconds.

    Args:
        project_name: The name of the Google Cloud project.
        query_client: A Monitoring API query client.
        soft_delete_window: The time window in seconds for considering
          soft-deleted objects (default is 7 days).
        storage_ratios_by_bucket: A dictionary of storage class ratios per bucket.
        agg_days: Aggregate results over this time period in days, defaults to
          a 30-day period.
        lookback_days: Look back up to this many days, defaults to 360 days.

    Returns:
        A dictionary with bucket names as keys and a list of cost data
        dictionaries for each bucket, broken down by storage class.
    """
    soft_deleted_bytes_time = query_client.query_time_series(
        monitoring_client.QueryTimeSeriesRequest(
            name=f"projects/{project_name}",
            query=f"""
                    {{  # Fetch 1: Soft-deleted (bytes seconds)
                        fetch gcs_bucket :: storage.googleapis.com/storage/v2/deleted_bytes
                        | value val(0) * {soft_delete_window}\'s\'  # Multiply by soft delete window
                        | group_by [resource.bucket_name, metric.storage_class], window(), .sum;

                        # Fetch 2: Total byte-seconds (active objects)
                        fetch gcs_bucket :: storage.googleapis.com/storage/v2/total_byte_seconds 
                        | filter metric.type != 'soft-deleted-object'
                        | group_by [resource.bucket_name, metric.storage_class], window(1d), .mean  # Daily average
                        | group_by [resource.bucket_name, metric.storage_class], window(), .sum  # Total over window

                    }}  # End query definition
                    | every {agg_days}d  # Aggregate over larger time intervals
                    | within {lookback_days}d  # Limit data range for analysis
                    | ratio  # Calculate ratio (soft-deleted (bytes seconds)/ total (bytes seconds))
                    """,
        )
    )

    buckets: Dict[str, List[Dict[str, float]]] = {}
    missing_distribution_storage_class = []
    for data_point in soft_deleted_bytes_time.time_series_data:
        bucket_name = data_point.label_values[0].string_value
        storage_class = data_point.label_values[1].string_value
        # To include location-based cost analysis:
        # 1. Uncomment the line below:
        # location = data_point.label_values[2].string_value
        # 2. Update how you calculate 'relative_storage_class_cost' to factor in location
        soft_delete_ratio = data_point.point_data[0].values[0].double_value
        distribution_storage_class = bucket_name + " - " + storage_class
        storage_class_ratio = storage_ratios_by_bucket.get(
            distribution_storage_class
        )
        if storage_class_ratio is None:
            missing_distribution_storage_class.append(
                distribution_storage_class)
        buckets.setdefault(bucket_name, []).append({
            # Include storage class and location data for additional plotting dimensions.
            # "storage_class": storage_class,
            # 'location': location,
            "soft_delete_ratio": soft_delete_ratio,
            "storage_class_ratio": storage_class_ratio,
            "relative_storage_class_cost": get_relative_cost(storage_class),
        })

    if missing_distribution_storage_class:
        print(
            "Missing storage class for following buckets:",
            missing_distribution_storage_class,
        )
        raise ValueError("Cannot proceed with missing storage class ratios.")

    return buckets


def get_storage_class_ratio(
    project_name: str,
    query_client: monitoring_client.QueryServiceClient,
    agg_days: int,
    lookback_days: int,
) -> Dict[str, float]:
    """Calculates storage class ratios for each bucket in a project.

    This information helps determine the relative cost contribution of each
    storage class to the overall soft-delete cost.

    Args:
        project_name: The Google Cloud project name.
        query_client: Google Cloud's Monitoring Client's QueryServiceClient.
        agg_days: Aggregate results over this time period in days, defaults to
          a 30-day period.
        lookback_days: Look back up to this many days, defaults to 360 days.

    Returns:
        A dictionary keyed by "<bucket_name> - <storage_class>" whose values are
        the ratio of that storage class's byte-seconds to the bucket's total
        byte-seconds.
    """
    request = monitoring_client.QueryTimeSeriesRequest(
        name=f"projects/{project_name}",
        query=f"""
            {{
            # Fetch total byte-seconds for each bucket and storage class
            fetch gcs_bucket :: storage.googleapis.com/storage/v2/total_byte_seconds
            | group_by [resource.bucket_name, metric.storage_class], window(), .sum;
            # Fetch total byte-seconds for each bucket (regardless of class)
            fetch gcs_bucket :: storage.googleapis.com/storage/v2/total_byte_seconds
            | group_by [resource.bucket_name], window(), .sum
            }}
            | ratio  # Calculate ratios of storage class size to total size
            | every {agg_days}d
            | within {lookback_days}d
            """,
    )

    storage_class_ratio = query_client.query_time_series(request)

    storage_ratios_by_bucket = {}
    for time_series in storage_class_ratio.time_series_data:
        bucket_name = time_series.label_values[0].string_value
        storage_class = time_series.label_values[1].string_value
        ratio = time_series.point_data[0].values[0].double_value

        # Create a descriptive key for the dictionary
        key = f"{bucket_name} - {storage_class}"
        storage_ratios_by_bucket[key] = ratio

    return storage_ratios_by_bucket


def soft_delete_relative_cost_analyzer(
    project_name: str,
    cost_threshold: float = 0.0,
    soft_delete_window: float = 604800,
    agg_days: int = 30,
    lookback_days: int = 360,
    list_buckets: bool = False,
) -> str | Dict[str, float]:  # Note potential string output
    """Identifies buckets exceeding the relative cost threshold for enabling soft delete.

    Args:
        project_name: The Google Cloud project name.
        cost_threshold: Threshold above which to consider removing soft delete.
        soft_delete_window: Time window for calculating soft-delete costs (in
          seconds).
        agg_days: Aggregate results over this time period (in days).
        lookback_days: Look back up to this many days.
        list_buckets: Return a list of bucket names (True) or JSON (False,
          default).

    Returns:
        JSON formatted results of buckets exceeding the threshold and costs
        *or* a space-separated string of bucket names.
    """

    buckets: Dict[str, float] = {}
    for bucket_name, storage_sources in get_soft_delete_cost(
        project_name, soft_delete_window, agg_days, lookback_days
    ).items():
        bucket_cost = 0.0
        for storage_source in storage_sources:
            bucket_cost += (
                storage_source["soft_delete_ratio"]
                * storage_source["storage_class_ratio"]
                * storage_source["relative_storage_class_cost"]
            )
        if bucket_cost > cost_threshold:
            buckets[bucket_name] = round(bucket_cost, 4)

    if list_buckets:
        return " ".join(buckets.keys())  # Space-separated bucket names
    else:
        return json.dumps(buckets, indent=2)  # JSON output


def soft_delete_relative_cost_analyzer_main() -> None:
    # Sample run: python storage_soft_delete_relative_cost_analyzer.py <Project Name>
    parser = argparse.ArgumentParser(
        description="Analyze and manage Google Cloud Storage soft-delete costs."
    )
    parser.add_argument(
        "project_name", help="The name of the Google Cloud project to analyze."
    )
    parser.add_argument(
        "--cost_threshold",
        type=float,
        default=0.0,
        help="Relative Cost threshold.",
    )
    parser.add_argument(
        "--soft_delete_window",
        type=float,
        default=604800.0,
        help="Time window (in seconds) for considering soft-deleted objects.",
    )
    parser.add_argument(
        "--agg_days",
        type=int,
        default=30,
        help=(
            "Time window (in days) for aggregating results over a time period,"
            " defaults to 30-day period"
        ),
    )
    parser.add_argument(
        "--lookback_days",
        type=int,
        default=360,
        help=(
            "Time window (in days) for considering the how old the bucket to be."
        ),
    )
    parser.add_argument(
        "--list",
        action="store_true",
        help="Return the list of bucket names separated by spaces.",
    )

    args = parser.parse_args()

    response = soft_delete_relative_cost_analyzer(
        args.project_name,
        args.cost_threshold,
        args.soft_delete_window,
        args.agg_days,
        args.lookback_days,
        args.list,
    )
    if not args.list:
        print(
            "To remove soft-delete policy from the listed buckets run:\n"
            # Capture output
            "python storage_soft_delete_relative_cost_analyzer.py"
            " [your-project-name] --[OTHER_OPTIONS] --list > list_of_buckets.txt \n"
            "cat list_of_buckets.txt | gcloud storage buckets update -I "
            "--clear-soft-delete",
            response,
        )
        return
    print(response)


if __name__ == "__main__":
    soft_delete_relative_cost_analyzer_main()
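
Note that the analyzer above only reports bucket names and relative costs; it doesn't change any policy. If you'd rather clear the policies from Python instead of piping the bucket names to gcloud, the following is a minimal, hypothetical sketch that reuses the analyzer's --list output and assumes the google-cloud-storage client library (which the sample above doesn't import); clear_soft_delete_for_buckets is an illustrative helper, not part of the sample.

  from typing import List

  from google.cloud import storage

  def clear_soft_delete_for_buckets(bucket_names: List[str]) -> None:
      """Disables soft delete on each of the named buckets."""
      client = storage.Client()
      for bucket_name in bucket_names:
          bucket = client.get_bucket(bucket_name)
          bucket.soft_delete_policy.retention_duration_seconds = 0
          bucket.patch()

  # The analyzer's list mode returns space-separated bucket names.
  names = soft_delete_relative_cost_analyzer("my-project", list_buckets=True)
  clear_soft_delete_for_buckets(names.split())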

What's next