This page provides examples of using the Google Cloud CLI to perform common Cloud Life Sciences API operations.
Full details are available in the gcloud beta lifesciences command reference.
Outputting a command to a Cloud Storage bucket
When running the Cloud Life Sciences API using gcloud, you can specify single commands using the --command-line flag:
gcloud beta lifesciences pipelines run \
  --location us-central1 \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'echo "hello world!"'
This command calls the Cloud Life Sciences API in the us-central1 region and creates a VM in the us-east1 region. All operation metadata is stored in us-central1, so to interact with the operation later on (for example, to get its details), you must specify the us-central1 region.
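For example, a later describe call can pass the operation's fully qualified name, which pins the us-central1 location. This is a minimal sketch; PROJECT_ID and OPERATION_ID come from the output of the pipelines run command:
gcloud beta lifesciences operations describe \
  projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID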
After the VM starts, it pulls the Google Cloud CLI Docker image and runs the command echo "hello world!". The output of the command is written to a log file in a Cloud Storage bucket specified with the --logging flag.
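Once the run completes, you can read that log with the gsutil tool (a minimal sketch, using the bucket path from the example above):
gsutil cat gs://BUCKET/my-path/example.log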
To use an image other than the gcloud CLI image, use the --docker-image flag. For example, the Quickstart uses the gcr.io/cloud-lifesciences/samtools image.
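For example, the following sketch combines --docker-image with a samtools invocation; the flag and image come from this page, while the 'samtools --version' command line is an illustrative assumption:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --docker-image gcr.io/cloud-lifesciences/samtools \
  --command-line 'samtools --version'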
Checking the status of a pipeline
After you run a pipeline using the gcloud CLI, the command returns the following:
Running [projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID]
You can use the OPERATION_ID value to check the status of the pipeline by running the following command:
gcloud beta lifesciences operations describe OPERATION_ID
Running the operations describe command provides the following details about the pipeline:
- Whether it has started
- Whether it is in progress
- Whether it finished successfully or encountered errors
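The response is a long-running operation resource. Its exact contents vary by pipeline; the following is only an illustrative sketch of its general shape (name, metadata, and done fields), not literal output:
done: true
metadata:
  '@type': type.googleapis.com/google.cloud.lifesciences.v2beta.Metadata
  ...
name: projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID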
To see only whether the operation has completed, use the --format flag:
gcloud --format="value(done)" beta lifesciences operations describe OPERATION_ID
The gcloud CLI provides other features for filtering operations and formatting the displayed values. Read the documentation for the --filter and --format flags for more information.
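For example, assuming the command group includes a standard operations list command (an assumption; check the command reference), a sketch that shows only completed operations might look like this:
gcloud beta lifesciences operations list \
  --filter="done=true" \
  --format="table(name,done)"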
Cancelling a pipeline run
To cancel a running pipeline, run gcloud beta lifesciences operations cancel and provide the operation ID:
gcloud beta lifesciences operations cancel OPERATION_ID
Note that the operation will not be immediately marked as done. The Compute Engine VM must be deleted before the operation completes, and this might take a few minutes.
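Because cancellation is asynchronous, you can poll until the operation is marked done. A minimal bash sketch that reuses the --format expression shown earlier:
# Poll every 10 seconds until the done field reports True.
while [ "$(gcloud --format='value(done)' beta lifesciences operations describe OPERATION_ID)" != "True" ]; do
  sleep 10
done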
Passing input parameters
Use the --inputs flag to pass input parameters to the Docker image:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'echo "${MESSAGE}"' \
  --inputs MESSAGE='hello world!'
The parameters are passed by name as environment variables to the command running inside the Docker container.
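Several parameters can be passed in one run. Assuming the flag accepts a comma-separated KEY=VALUE list like other gcloud dictionary-style flags (an assumption), a sketch:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'echo "${GREETING}, ${NAME}!"' \
  --inputs GREETING=Hello,NAME=world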
Specifying input and output files
Use the --inputs flag to specify input files:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'cat ${INPUT_FILE} | md5sum' \
  --inputs INPUT_FILE=gs://BUCKET/INPUT_FILE
The gs://BUCKET/my-path/example.log log file contains the resulting md5sum.
Use the --outputs flag to write the results of the command to a file in Cloud Storage:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'cat "${INPUT_FILE}" | md5sum > "${OUTPUT_FILE}"' \
  --inputs INPUT_FILE=gs://BUCKET/INPUT_FILE \
  --outputs OUTPUT_FILE=gs://BUCKET/OUTPUT_FILE.md5
Input and output files can contain wildcards, so you can pass multiple files at once. Recursive copy is not available.
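For example, a sketch that checksums every matching object under a path; it assumes the wildcard expands to the matched files' local paths inside the container, and the file names are hypothetical:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'md5sum ${INPUT_FILES} > "${OUTPUT_FILE}"' \
  --inputs INPUT_FILES=gs://BUCKET/my-path/*.txt \
  --outputs OUTPUT_FILE=gs://BUCKET/my-path/checksums.md5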
Using preemptible VMs
You can use a preemptible VM, which can be up to 80% cheaper than a regular VM. However, Compute Engine might terminate (preempt) this VM if Compute Engine requires access to the VM's resources for other tasks. If your VM is preempted, you will have to restart your pipeline.
To use a preemptible VM, use the --preemptible flag when running your pipeline:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'echo "hello world!"' \
  --preemptible
Setting VM instance types
By default, the Compute Engine VM that runs the pipeline will be an n1-standard-1. You can specify the amount of memory and the number of CPU cores by selecting a machine type with the --machine-type flag:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --command-line 'echo "hello world!"' \
  --machine-type MACHINE_TYPE
If you are specifying your pipeline using a YAML or JSON file, you can specify any supported Compute Engine machine type by entering it in the machineType field.
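A sketch of how that might look, assuming the v2beta Pipeline message layout with a resources.virtualMachine section (check the reference for the exact field names):
actions:
- commands: [ '-c', 'echo "Hello, world!"' ]
  imageUri: bash
resources:
  virtualMachine:
    machineType: n1-highmem-4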
Writing complex pipeline definitions
For more complex pipelines, define the pipeline in a YAML or JSON file and pass the file to the gcloud CLI with the --pipeline-file flag. The file must contain a single Pipeline message, as described in the reference documentation.
Defining your pipeline this way lets you specify advanced features including multiple disks and background containers.
To convert the command from the hello world! example to a pipeline file, copy the following text and save it to a file named hello.yaml:
actions:
- commands: [ '-c', 'echo "Hello, world!"' ]
  imageUri: bash
Run the hello.yaml file by passing the --pipeline-file flag to the gcloud CLI:
gcloud beta lifesciences pipelines run \
  --regions us-east1 \
  --logging gs://BUCKET/my-path/example.log \
  --pipeline-file hello.yaml
Using multiple Docker containers
The preceding examples focus on running a single command using a single Docker container. If the pipeline requires multiple separate commands from multiple containers, you must specify a pipeline in a YAML or JSON file.
You can add more steps to a YAML or JSON configuration file by entering them in the actions list. To run a pipeline using two Docker containers, add the following to a YAML configuration file:
actions:
- commands: [ 'echo', 'Hello from bash!' ]
  imageUri: bash
- commands: [ 'echo', 'Hello from ubuntu!' ]
  imageUri: ubuntu
When you run the pipeline, the first command runs in the bash Docker image, and the second command runs in the ubuntu Docker image.
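As a sketch of the advanced features mentioned earlier (multiple disks and background containers), the following hypothetical pipeline file attaches a data disk and runs one action in the background while a second action reads from the shared disk; the runInBackground, mounts, and resources fields are assumptions based on the v2beta Pipeline reference:
actions:
# Hypothetical background action; keeps running while later actions execute.
- commands: [ '-c', 'while true; do date >> /data/heartbeat.txt; sleep 5; done' ]
  imageUri: bash
  runInBackground: true
  mounts:
  - disk: datadisk
    path: /data
# Second action reads what the background action wrote to the shared disk.
- commands: [ '-c', 'sleep 20 && cat /data/heartbeat.txt' ]
  imageUri: bash
  mounts:
  - disk: datadisk
    path: /data
resources:
  virtualMachine:
    disks:
    - name: datadisk
      sizeGb: 10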