Shim for Running gcloud storage
Overview
The Cloud SDK includes a new CLI, gcloud storage, that can be considerably faster than gsutil for uploads and downloads while requiring less parameter tweaking. The new CLI has a syntax and command structure that is familiar to gsutil users but differs fundamentally in many important ways. To ease the transition, gsutil provides a shim that translates your gsutil commands to gcloud storage commands when an equivalent exists and falls back to gsutil's usual behavior when it does not.
To Enable
Set use_gcloud_storage=True in the .boto config file under the [GSUtil] section:
[GSUtil]
use_gcloud_storage=True
You can also set the flag for individual commands using the top-level -o flag:
gsutil -o "GSUtil:use_gcloud_storage=True" -m cp -p file gs://bucket/obj
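For scripts that call gsutil programmatically, the per-command form can be assembled as an argument list. The following is a minimal sketch: the helper name shim_argv is hypothetical, and it only builds the arguments from the -o setting shown above without running anything.

```python
# Hypothetical helper: builds a gsutil argument list that enables the shim
# for a single invocation via the top-level -o flag.
SHIM_OPTION = "GSUtil:use_gcloud_storage=True"


def shim_argv(*gsutil_args):
    """Return the argv for one gsutil call with the shim enabled."""
    # Top-level options such as -o and -m go before the command name.
    return ["gsutil", "-o", SHIM_OPTION, *gsutil_args]


if __name__ == "__main__":
    # Same command as the example above. Pass the list to subprocess.run
    # to execute it on a machine where gsutil is installed.
    print(" ".join(shim_argv("-m", "cp", "-p", "file", "gs://bucket/obj")))
```

Building a list rather than a single string avoids shell quoting issues around the colon-separated option value.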
Available Commands
The gcloud storage CLI only supports a subset of gsutil commands. What follows is a list of commands supported by the shim with any differences in behavior noted.
acl
The ch subcommand is not supported.
autoclass
Works as expected.
bucketpolicyonly
Works as expected.
cat
Prints object data for a second object even if the first object is invalid.
compose
Works as expected.
cors
The get subcommand prints "[]" instead of "gs://[bucket name] has no CORS configuration".
cp
Copies a second object even if the first object is invalid.
Does not support file to file copies.
Supports copying objects cloud-to-cloud with trailing slashes in the name.
The all-version flag (-A) silently enables sequential execution rather than raising an error.
defacl
The ch subcommand is not supported.
defstorageclass
Works as expected.
hash
In gsutil, the -m and -c flags that affect which hashes are displayed are ignored for cloud objects. This behavior is fixed for the shim and gcloud storage.
iam
The ch subcommand is not supported.
The -f flag will continue on any error, not just API errors.
kms
The authorize subcommand returns informational messages in a different format.
The encryption subcommand returns informational messages in a different format.
labels
The get subcommand prints "[]" instead of "gs://[bucket name] has no labels configuration."
lifecycle
Works as expected.
logging
The get subcommand has different JSON spacing and doesn't print an informational message if no configuration is found.
ls
Works as expected.
mb
Works as expected.
mv
See notes on cp.
notification
The list subcommand prints configuration information as YAML.
The delete subcommand offers progress tracking and parallelization.
pap
Works as expected.
rb
Works as expected.
requesterpays
Works as expected.
rewrite
The -k flag does not throw an error if called without a new key. In both the shim and unshimmed cases, the old key is maintained.
rm
$folder$ delete markers are not supported.
rpo
Works as expected.
setmeta
Does not throw an error if no headers are changed.
stat
Includes a field "Storage class update time:" which may throw off tabbing.
ubla
Works as expected.
versioning
Works as expected.
web
The get subcommand has different JSON spacing and doesn't print an informational message if no configuration is found.
Boto Configuration
Configuration found in the boto file is mapped 1:1 to gcloud environment variables where appropriate.
[Credentials]
aws_access_key_id: AWS_ACCESS_KEY_ID
aws_secret_access_key: AWS_SECRET_ACCESS_KEY
use_client_certificate: CLOUDSDK_CONTEXT_AWARE_USE_CLIENT_CERTIFICATE
[Boto]
proxy: CLOUDSDK_PROXY_ADDRESS
proxy_type: CLOUDSDK_PROXY_TYPE
proxy_port: CLOUDSDK_PROXY_PORT
proxy_user: CLOUDSDK_PROXY_USERNAME
proxy_pass: CLOUDSDK_PROXY_PASSWORD
proxy_rdns: CLOUDSDK_PROXY_RDNS
http_socket_timeout: CLOUDSDK_CORE_HTTP_TIMEOUT
ca_certificates_file: CLOUDSDK_CORE_CUSTOM_CA_CERTS_FILE
max_retry_delay: CLOUDSDK_STORAGE_BASE_RETRY_DELAY
num_retries: CLOUDSDK_STORAGE_MAX_RETRIES
[GSUtil]
check_hashes: CLOUDSDK_STORAGE_CHECK_HASHES
default_project_id: CLOUDSDK_CORE_PROJECT
disable_analytics_prompt: CLOUDSDK_CORE_DISABLE_USAGE_REPORTING
use_magicfile: CLOUDSDK_STORAGE_USE_MAGICFILE
parallel_composite_upload_threshold: CLOUDSDK_STORAGE_PARALLEL_COMPOSITE_UPLOAD_THRESHOLD
resumable_threshold: CLOUDSDK_STORAGE_RESUMABLE_THRESHOLD
[OAuth2]
client_id: CLOUDSDK_AUTH_CLIENT_ID
client_secret: CLOUDSDK_AUTH_CLIENT_SECRET
provider_authorization_uri: CLOUDSDK_AUTH_AUTH_HOST
provider_token_uri: CLOUDSDK_AUTH_TOKEN_HOST
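To illustrate the mapping above, the following Python sketch reads boto-style configuration text and reports the gcloud environment variables that the listed options correspond to. The function name boto_to_env and the sample values are hypothetical, only a subset of the table is included, and values are passed through unchanged, which is an assumption: the shim may translate some values differently.

```python
import configparser

# Subset of the mapping listed above: (section, option) -> gcloud variable.
BOTO_TO_GCLOUD_ENV = {
    ("Boto", "proxy"): "CLOUDSDK_PROXY_ADDRESS",
    ("Boto", "proxy_port"): "CLOUDSDK_PROXY_PORT",
    ("Boto", "num_retries"): "CLOUDSDK_STORAGE_MAX_RETRIES",
    ("GSUtil", "default_project_id"): "CLOUDSDK_CORE_PROJECT",
    ("GSUtil", "check_hashes"): "CLOUDSDK_STORAGE_CHECK_HASHES",
}


def boto_to_env(boto_text):
    """Return {env_var: value} for the mapped options present in boto_text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(boto_text)
    env = {}
    for (section, option), env_var in BOTO_TO_GCLOUD_ENV.items():
        # has_option returns False when the section itself is missing.
        if parser.has_option(section, option):
            env[env_var] = parser.get(section, option)
    return env


if __name__ == "__main__":
    # Hypothetical sample configuration, not taken from a real .boto file.
    sample = """
[Boto]
proxy_port = 8080
[GSUtil]
default_project_id = my-project
"""
    for name, value in boto_to_env(sample).items():
        print(f"{name}={value}")
```

Options that have no entry in the table are simply skipped, which mirrors the "where appropriate" qualifier above.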
General Compatibility Notes
Because multiprocessing in gcloud storage is compatible with all major platforms, it is enabled for all commands by default (equivalent to always including the -m option in gsutil).
A sequence of more than two asterisks (e.g. ***) is always treated as a single asterisk.
Unlike gsutil, gcloud is not designed to be used in parallel invocations, and doing so (e.g. running the shim from two terminals at once) can lead to unpredictable behavior.
Assuming a bucket contains an object gs://bucket/nested/foo.txt, gsutil's wildcard iterator will match foo.txt given a URL like gs://bucket/*/nested/*. The shim will not match foo.txt given the same URL.
This will be updated as new commands are supported by both gcloud storage and the shim.
If you run into Unicode issues, try setting the environment variable PYTHONUTF8 to 1. This may help in particular on the Windows command line (CMD).
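The asterisk rule in the notes above (a run of more than two asterisks is treated as a single asterisk) can be modeled with a short Python sketch. The function name collapse_asterisks is hypothetical, and this is not the shim's actual implementation; it only illustrates the described normalization.

```python
import re

# Matches any run of three or more consecutive asterisks.
_ASTERISK_RUN = re.compile(r"\*{3,}")


def collapse_asterisks(url):
    """Replace each run of three or more asterisks with a single asterisk."""
    return _ASTERISK_RUN.sub("*", url)


if __name__ == "__main__":
    print(collapse_asterisks("gs://bucket/***/foo.txt"))
    # Runs of one or two asterisks are left unchanged.
    print(collapse_asterisks("gs://bucket/**/foo.txt"))
```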