Filter by prefix

This page shows you how to include and exclude paths from transfers using include and exclude prefixes.

To learn how to create a manifest of specific objects to transfer, see Transfer specific files or objects using a manifest.

Overview

Storage Transfer Service supports the use of prefixes to select which files to include or exclude from the data source. You can use include prefixes, exclude prefixes, or both together.

Filtering by prefix is supported for Amazon S3, Microsoft Azure Blob Storage, and Cloud Storage data sources.

  • Do not include the leading slash in a prefix. For example, to include the requests.gz object in a transfer from the following bucket path s3://my-aws-bucket/logs/y=2015/requests.gz, specify the include prefix as logs/y=2015/requests.gz.

  • Partial matches are supported for include and exclude prefixes. For example, path matches path_1/ and path_2/.

  • Wildcards are not supported.

  • If you specified a folder as your source location, prefix filters are relative to that folder. For example, if your source is gs://my-test-bucket/path/, an include filter of file includes all files starting with gs://my-test-bucket/path/file.

  • Each include prefix must include a distinct portion of the object namespace. No include prefix may be a prefix of another include prefix. For example, you may not specify both path_1 and path_1/subpath_2 as include prefixes.

  • If you use include prefixes and exclude prefixes together, then exclude prefixes must start with the value of one of the include prefixes. For example, if you specify a as an include prefix, valid exclude prefixes are a/b, aaa, and abc.

  • If you use just exclude prefixes, there are no restrictions on the prefixes you can use.

  • If you do not specify any prefixes, then all objects in the bucket are transferred.
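The restrictions in the list above can be checked locally before a job is submitted. The following Python sketch is an illustration only: `validate_prefixes` is a hypothetical helper, not part of Storage Transfer Service or its client libraries, and it encodes just the rules stated on this page.

```python
def validate_prefixes(include=(), exclude=()):
    """Return a list of rule violations; an empty list means none were found.

    Hypothetical local check that mirrors only the rules documented above.
    """
    problems = []

    # Rule: prefixes must not begin with a leading slash.
    for prefix in list(include) + list(exclude):
        if prefix.startswith("/"):
            problems.append(f"leading slash: {prefix!r}")

    # Rule: no include prefix may be a prefix of another include prefix.
    for i, a in enumerate(include):
        for j, b in enumerate(include):
            if i != j and b.startswith(a):
                problems.append(f"include {a!r} is a prefix of include {b!r}")

    # Rule: when both kinds are used, every exclude prefix must start with
    # one of the include prefixes. Exclude-only filters are unrestricted.
    if include:
        for prefix in exclude:
            if not any(prefix.startswith(inc) for inc in include):
                problems.append(f"exclude {prefix!r} matches no include prefix")

    return problems


# The valid combination from the list above reports no problems.
print(validate_prefixes(["a"], ["a/b", "aaa", "abc"]))
# Overlapping include prefixes are reported.
print(validate_prefixes(["path_1", "path_1/subpath_2"]))
```

The service performs its own validation when the job is created; a check like this only surfaces obvious mistakes earlier.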

For more general information about prefixes, see Listing Keys Hierarchically Using a Prefix and Delimiter in the Amazon S3 documentation or the Objects list method for Cloud Storage.

How to specify prefixes

Cloud console

To specify include and exclude prefixes using the Cloud console, enter the values when creating a new transfer or when updating an existing transfer.

gcloud CLI

To specify include and exclude prefixes using the gcloud CLI, pass the --include-prefixes and --exclude-prefixes flags to the gcloud transfer jobs create command or the gcloud transfer jobs update command:

gcloud transfer jobs create SOURCE DESTINATION \
  --include-prefixes="path_1/,path_2/" --exclude-prefixes="path_1/subpath_2/"

Separate multiple prefixes with commas, omitting spaces after the commas. For example, --include-prefixes=foo,bar.
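If you generate the command from a script, the comma-separated values can be assembled from lists. The following Python sketch assumes a hypothetical helper, `build_create_command`, and placeholder bucket URLs; it only builds the argument list and does not run gcloud.

```python
import shlex


def build_create_command(source, destination, include=(), exclude=()):
    """Build the argument list for `gcloud transfer jobs create`.

    Hypothetical helper: it assembles arguments and nothing else.
    """
    args = ["gcloud", "transfer", "jobs", "create", source, destination]
    if include:
        # Join with bare commas: no spaces after the separators.
        args.append("--include-prefixes=" + ",".join(include))
    if exclude:
        args.append("--exclude-prefixes=" + ",".join(exclude))
    return args


command = build_create_command(
    "s3://my-aws-bucket",   # placeholder source
    "gs://my-test-bucket",  # placeholder destination
    include=["path_1/", "path_2/"],
    exclude=["path_1/subpath_2/"],
)
# shlex.join renders the list as a single shell-safe command line.
print(shlex.join(command))
```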

REST

To specify include and exclude prefixes using the REST API, use the includePrefixes[] and excludePrefixes[] fields:

{
    "description": "YOUR DESCRIPTION",
    "status": "ENABLED",
    "projectId": "PROJECT_ID",
    "schedule": {
        "scheduleStartDate": {
            "day": 1,
            "month": 1,
            "year": 2015
        },
        "startTimeOfDay": {
            "hours": 1,
            "minutes": 1
        }
    },
    "transferSpec": {
        "gcsDataSource": {
            "bucketName": "GCS_SOURCE_NAME"
        },
        "gcsDataSink": {
            "bucketName": "GCS_SINK_NAME"
        },
        "transferOptions": {
            "deleteObjectsFromSourceAfterTransfer": true
        },
        "objectConditions": {
            "includePrefixes": [
                "path_1/",
                "path_2/"
            ],
            "excludePrefixes": [
                "path_1/subpath_2/object_5"
            ]
        }
    }
}

For more information, refer to the ObjectConditions reference.
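When the request body is built programmatically, the prefix lists map directly onto the two fields shown above. The following Python sketch is one possible way to do this; `object_conditions` is a hypothetical helper, and the bucket names are the same placeholders used in the JSON example.

```python
import json


def object_conditions(include=(), exclude=()):
    """Return an objectConditions fragment containing only the fields set."""
    conditions = {}
    if include:
        conditions["includePrefixes"] = list(include)
    if exclude:
        conditions["excludePrefixes"] = list(exclude)
    return conditions


transfer_spec = {
    "gcsDataSource": {"bucketName": "GCS_SOURCE_NAME"},  # placeholder
    "gcsDataSink": {"bucketName": "GCS_SINK_NAME"},      # placeholder
    "objectConditions": object_conditions(
        include=["path_1/", "path_2/"],
        exclude=["path_1/subpath_2/object_5"],
    ),
}

# Print the fragment as JSON for inclusion in a transfer job request body.
print(json.dumps({"transferSpec": transfer_spec}, indent=4))
```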

Example objects and paths

The examples in this document use the following sample objects and paths:

xx://bucketname/object_1
xx://bucketname/object_2
xx://bucketname/path_1/object_3
xx://bucketname/path_2/object_4
xx://bucketname/path_1/subpath_1/object_5
xx://bucketname/path_1/subpath_2/object_6
xx://bucketname/path_2/subpath_3/object_7
xx://bucketname/path_2/subpath_4/object_8

Include prefixes

Use include prefixes when creating a transfer to instruct Storage Transfer Service to consider objects in the listed paths for transfer, and to ignore objects not under those paths.

For example, to include objects under path_1/, use the following prefix:

path_1/

This includes objects directly under path_1/, as well as objects under path_1/subpath_1/ and path_1/subpath_2/. The following objects are included in the transfer:

xx://bucketname/path_1/object_3
xx://bucketname/path_1/subpath_1/object_5
xx://bucketname/path_1/subpath_2/object_6

You can specify multiple paths to include. For example, you can pass the following:

path_1/subpath_2/
path_1/subpath_3/

In this case, the transfer includes only the sample objects that match one of these prefixes. No sample object is under path_1/subpath_3/, so a single object is transferred:

xx://bucketname/path_1/subpath_2/object_6

Partial matches are supported. For example, specifying path as the value of an include prefix matches the following objects:

xx://bucketname/path_1/object_3
xx://bucketname/path_2/object_4
xx://bucketname/path_1/subpath_1/object_5
xx://bucketname/path_1/subpath_2/object_6
xx://bucketname/path_2/subpath_3/object_7
xx://bucketname/path_2/subpath_4/object_8

When you use include prefixes, paths that you don't specifically include aren't transferred to the Cloud Storage destination bucket.
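Include matching amounts to a plain string-prefix test on the object name. The following Python sketch models that behavior against the sample objects so that you can predict a result locally; `apply_include` is a hypothetical function, and the actual selection is performed by Storage Transfer Service.

```python
# Sample object names, relative to the bucket (no scheme or bucket name).
SAMPLE_OBJECTS = [
    "object_1",
    "object_2",
    "path_1/object_3",
    "path_2/object_4",
    "path_1/subpath_1/object_5",
    "path_1/subpath_2/object_6",
    "path_2/subpath_3/object_7",
    "path_2/subpath_4/object_8",
]


def apply_include(names, include_prefixes):
    """Keep the names that start with at least one include prefix."""
    if not include_prefixes:
        # No include prefixes means every object is considered.
        return list(names)
    # str.startswith accepts a tuple and is true if any entry matches.
    return [n for n in names if n.startswith(tuple(include_prefixes))]


# A folder-style prefix selects everything beneath that path.
print(apply_include(SAMPLE_OBJECTS, ["path_1/"]))
# A partial prefix matches both path_1/ and path_2/.
print(apply_include(SAMPLE_OBJECTS, ["path"]))
```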

Exclude prefixes

Use exclude prefixes when creating a transfer to instruct Storage Transfer Service to skip objects under the listed paths.

To exclude objects under path_1/, pass the following prefix:

path_1/

This excludes objects under path_1/, path_1/subpath_1/, and path_1/subpath_2/. In this case, the following objects are included in the transfer:

xx://bucketname/object_1
xx://bucketname/object_2
xx://bucketname/path_2/object_4
xx://bucketname/path_2/subpath_3/object_7
xx://bucketname/path_2/subpath_4/object_8

You can specify multiple paths to exclude. For example, you can pass the following:

path_1/subpath_2/
path_2/subpath_3/

In this case, the transfer includes the following objects:

xx://bucketname/object_1
xx://bucketname/object_2
xx://bucketname/path_1/object_3
xx://bucketname/path_2/object_4
xx://bucketname/path_1/subpath_1/object_5
xx://bucketname/path_2/subpath_4/object_8
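Exclusion is the inverse test: an object is skipped when its name starts with any exclude prefix. The following Python sketch models this against the sample objects; `apply_exclude` is a hypothetical function used only to reason about results locally.

```python
# Sample object names, relative to the bucket (no scheme or bucket name).
SAMPLE_OBJECTS = [
    "object_1",
    "object_2",
    "path_1/object_3",
    "path_2/object_4",
    "path_1/subpath_1/object_5",
    "path_1/subpath_2/object_6",
    "path_2/subpath_3/object_7",
    "path_2/subpath_4/object_8",
]


def apply_exclude(names, exclude_prefixes):
    """Drop the names that start with any exclude prefix."""
    blocked = tuple(exclude_prefixes)
    # With an empty tuple, startswith is always false, so nothing is dropped.
    return [n for n in names if not n.startswith(blocked)]


# Excluding path_1/ leaves the root-level objects and everything in path_2/.
print(apply_exclude(SAMPLE_OBJECTS, ["path_1/"]))
# Several exclude prefixes can be combined.
print(apply_exclude(SAMPLE_OBJECTS, ["path_1/subpath_2/", "path_2/subpath_3/"]))
```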

Including and excluding paths simultaneously

You can apply an exclude prefix and an include prefix together, in which case the exclude prefix limits what the include prefix includes in the transfer.

When specifying both types of prefix, each exclude prefix must start with a path that is specified in an include prefix.

For example, to include objects under path_1/ and exclude objects under path_1/subpath_1/, pass the following:

include: path_1/
exclude: path_1/subpath_1/

In this case, the transfer includes the following objects:

xx://bucketname/path_1/object_3
xx://bucketname/path_1/subpath_2/object_6

To include all objects under path_1/ and path_2/, except items in either path_1/subpath_1/ or path_2/subpath_3/, pass the following:

include: path_1/
         path_2/
exclude: path_1/subpath_1/
         path_2/subpath_3/

In this case, the transfer includes the following objects:

xx://bucketname/path_1/object_3
xx://bucketname/path_2/object_4
xx://bucketname/path_1/subpath_2/object_6
xx://bucketname/path_2/subpath_4/object_8
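When both kinds of prefix are set, an object is transferred if it matches an include prefix and does not match any exclude prefix. The following Python sketch models that combined rule; `is_transferred` is a hypothetical predicate for local reasoning, not the service's implementation.

```python
# Sample object names, relative to the bucket (no scheme or bucket name).
SAMPLE_OBJECTS = [
    "object_1",
    "object_2",
    "path_1/object_3",
    "path_2/object_4",
    "path_1/subpath_1/object_5",
    "path_1/subpath_2/object_6",
    "path_2/subpath_3/object_7",
    "path_2/subpath_4/object_8",
]


def is_transferred(name, include=(), exclude=()):
    """Return True if the name passes the include and exclude filters."""
    # An empty include list places no restriction on the name.
    included = not include or name.startswith(tuple(include))
    # An empty exclude tuple never matches, so nothing is excluded.
    excluded = name.startswith(tuple(exclude))
    return included and not excluded


selected = [
    name
    for name in SAMPLE_OBJECTS
    if is_transferred(
        name,
        include=["path_1/", "path_2/"],
        exclude=["path_1/subpath_1/", "path_2/subpath_3/"],
    )
]
print(selected)
```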

Examples of incorrect path inclusion or exclusion

The following sections show invalid combinations of include and exclude prefixes, and how to correct them.

Including a path that is used in another include prefix

Each include prefix should specify a distinct portion of the object namespace. The following example is incorrect because the second value is already included in the namespace of the first value:

include: path_1/
         path_1/subpath_1

Here, path_1/subpath_1 falls within path_1/, which is already an include prefix. To fix this, remove one of the two values.

Using an exclude prefix that doesn't start with an include prefix

Each exclude prefix must start with one of the specified include prefix values. The following example is incorrect because the exclude prefix values don't start with any of the specified include prefix values:

include: path_1/
         path_2/
exclude: subpath_1
         subpath_4

In this example, the exclude prefix values are invalid because neither starts with one of the include prefix values. To fix this, make sure that each exclude prefix begins with the full value of one of the include prefixes:

include: path_1/
         path_2/
exclude: path_1/subpath_1/
         path_2/subpath_4/

What's next