perfdiag - Run performance diagnostic
Synopsis
gsutil perfdiag [-i in.json]
gsutil perfdiag [-o out.json] [-n objects] [-c processes] [-k threads] [-p parallelism type] [-y slices] [-s size] [-d directory] [-t tests] [-j ratio] gs://<bucket_name>...
Description
The perfdiag command runs a suite of diagnostic tests for a given Cloud Storage bucket.
The bucket_name parameter must name an existing bucket to which the user has write permission. Several test files will be uploaded to and downloaded from this bucket. All test files will be deleted at the completion of the diagnostic if it finishes successfully. For a list of relevant permissions, see Cloud IAM permissions for gsutil commands.
gsutil performance can be influenced by a number of factors originating at the client, server, or network level. Some examples include the following:
CPU speed
Available memory
The access path to the local disk
Network bandwidth
Contention and error rates along the path between gsutil and Google servers
Operating system buffering configuration
Firewalls and other network elements
The perfdiag command is provided so that customers can run a known measurement suite when troubleshooting performance problems.
Providing Diagnostic Output To The Cloud Storage Team
If the Cloud Storage team asks you to run a performance diagnostic, use the following command and email the output file (output.json) to the @google.com address provided by the Cloud Storage team:
gsutil perfdiag -o output.json gs://your-bucket
Additional resources for discussing perfdiag results include the Stack Overflow tag for Cloud Storage and the gsutil GitHub repository.
Options
- -n
Sets the number of objects to use when downloading and uploading files during tests. Defaults to 5.
- -c
Sets the number of processes to use while running throughput experiments. The default value is 1.
- -k
Sets the number of threads per process to use while running throughput experiments. Each process will receive an equal number of threads. The default value is 1.
- -p
Sets the type of parallelism to be used (only applicable when threads or processes are specified and threads * processes > 1). The default is to use fan. Must be one of the following:
  - fan
  Use one thread per object. This is akin to using gsutil -m cp, with sliced object download / parallel composite upload disabled.
  - slice
  Use Y (specified with -y) threads for each object, transferring one object at a time. This is akin to using parallel object download / parallel composite upload, without -m. Sliced uploads not supported for s3.
  - both
  Use Y (specified with -y) threads for each object, transferring multiple objects at a time. This is akin to simultaneously using sliced object download / parallel composite upload and gsutil -m cp. Parallel composite uploads not supported for s3.
- -y
Sets the number of slices to divide each file/object into while transferring data. Only applicable with the slice (or both) parallelism type. The default is 4 slices.
- -s
Sets the size (in bytes) for each of the N (set with -n) objects used in the read and write throughput tests. The default is 1 MiB. This can also be specified using byte suffixes such as 500K or 1M.
- -d
Sets the directory to store temporary local files in. If not specified, a default temporary directory will be used.
- -t
Sets the list of diagnostic tests to perform. The default is to run the lat, rthru, and wthru diagnostic tests. Must be a comma-separated list containing one or more of the following:
  - lat
  For N (set with -n) objects, write the object, retrieve its metadata, read the object, and finally delete the object. Record the latency of each operation.
  - list
  Write N (set with -n) objects to the bucket, record how long it takes for the eventually consistent listing call to return the N objects in its result, delete the N objects, then record how long it takes listing to stop returning the N objects.
  - rthru
  Runs N (set with -n) read operations, with at most C (set with -c) reads outstanding at any given time.
  - rthru_file
  The same as rthru, but simultaneously writes data to the disk, to gauge the performance impact of the local disk on downloads.
  - wthru
  Runs N (set with -n) write operations, with at most C (set with -c) writes outstanding at any given time.
  - wthru_file
  The same as wthru, but simultaneously reads data from the disk, to gauge the performance impact of the local disk on uploads.
- -m
Adds metadata to the result JSON file. Multiple -m values can be specified. Example:
gsutil perfdiag -m "key1:val1" -m "key2:val2" gs://bucketname
Each metadata key will be added to the top-level "metadata" dictionary in the output JSON file.
- -o
Writes the results of the diagnostic to an output file. The output is a JSON file containing system information and performance diagnostic results. The file can be read and reported later using the -i option.
- -i
Reads the JSON output file created using the -o option and prints a formatted description of the results.
- -j
Applies gzip transport encoding and sets the target compression ratio for the generated test files. This ratio can be an integer between 0 and 100 (inclusive), with 0 generating a file with uniform data, and 100 generating random data. When you specify the -j option, files being uploaded are compressed in-memory and on-the-wire only. See cp -j for specific semantics.
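As an illustration of how the options above combine, the following sketch runs the throughput tests once per parallelism type, saves the last run with -o, and re-reads it with -i. The run helper, the DRY_RUN and BUCKET variables, the bucket name, and the specific option values are assumptions chosen for the example; only the gsutil flags themselves come from the option list above. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper, not part of gsutil): prints each
# command by default; set DRY_RUN=0 to actually execute it.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

BUCKET="${BUCKET:-gs://your-bucket}"  # placeholder; use a bucket you can write to

# Options shared by all runs: throughput tests only, 8 objects of 1M each,
# 2 processes with 4 threads each (so threads * processes > 1).
COMMON="-t rthru,wthru -n 8 -s 1M -c 2 -k 4"

# fan: one thread per object.
run gsutil perfdiag $COMMON -p fan "$BUCKET"

# slice: one object at a time, each divided into 4 slices.
run gsutil perfdiag $COMMON -p slice -y 4 "$BUCKET"

# both: multiple objects at a time, each divided into 4 slices;
# save the results to out.json for later inspection.
run gsutil perfdiag $COMMON -p both -y 4 -o out.json "$BUCKET"

# Print a formatted description of the saved results.
run gsutil perfdiag -i out.json
```

Running the script with DRY_RUN=0 and BUCKET set to a real bucket executes the four commands in order. Keeping the object count and size identical across runs makes the three parallelism types easier to compare.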
Measuring Availability
The perfdiag command ignores the boto num_retries configuration parameter. Instead, it always retries on HTTP errors in the 500 range and keeps track of how many 500 errors were encountered during the test. The availability measurement is reported at the end of the test.
Note that HTTP responses are only recorded when the request was made in a single process. When using multiple processes or threads, read and write throughput measurements are performed in an external process, so the availability numbers reported won't include the throughput measurements.
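The text above does not give the exact formula behind the availability figure. A minimal sketch, assuming availability is simply the percentage of requests that did not end in a 500-range error, could look like the following. The availability function name and the sample counts are hypothetical and are not taken from perfdiag output.

```shell
#!/bin/sh
# availability TOTAL ERRORS
# Prints the percentage of requests that did not fail with a 5xx error.
# Assumption: availability = 100 * (total - errors) / total.
availability() {
  total="$1"
  errors="$2"
  if [ "$total" -le 0 ]; then
    echo "total request count must be positive" >&2
    return 1
  fi
  awk -v t="$total" -v e="$errors" 'BEGIN { printf "%.2f\n", 100 * (t - e) / t }'
}

# Example with made-up counts: 200 requests, 3 of which returned a 5xx error.
availability 200 3
```

Because throughput tests run in an external process when multiple processes or threads are used, counts gathered this way would cover only the requests made in the main process.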
Note
The perfdiag command runs a series of tests that collect system information, such as the following:
Retrieves requester's IP address.
Executes DNS queries to Google servers and collects the results.
Collects network statistics information from the output of netstat -s and evaluates the BIOS product name string.
If a proxy server is configured, attempts to connect to it to retrieve the location and storage class of the bucket being used for performance testing.
None of this information will be sent to Google unless you proactively choose to send it.