Google Cloud CLI examples

This page provides examples of using the Google Cloud CLI to perform common Cloud Life Sciences API operations.

Full details are available in the gcloud beta lifesciences command reference.

Outputting a command's results to a Cloud Storage bucket

When running the Cloud Life Sciences API using gcloud, you can specify single commands using the --command-line flag:

gcloud beta lifesciences pipelines run \
    --location us-central1 \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'echo "hello world!"'

This command calls the Cloud Life Sciences API in the us-central1 region and creates a VM in the us-east1 region. All operation metadata is stored in us-central1, so to interact with the operation later on (for example, to get its details), you must specify the us-central1 region.
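
For example, to get the operation's details later, one option is to pass the fully qualified operation name, which includes the us-central1 location. This is a sketch, assuming PROJECT_ID and OPERATION_ID are the values returned when the pipeline started:

gcloud beta lifesciences operations describe \
    projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID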

After the VM starts, it pulls the Google Cloud CLI Docker image and runs the command: echo "hello world!". The output of the command is written to a log file in a Cloud Storage bucket specified with the --logging flag.
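
After the pipeline finishes, you can read the log file back from Cloud Storage to view the command's output, for example with gsutil:

gsutil cat gs://BUCKET/my-path/example.log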

To use an image other than the gcloud CLI image, specify it with the --docker-image flag. For example, the Quickstart uses the gcr.io/cloud-lifesciences/samtools image.
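
The following sketch runs a command inside the samtools image instead of the default image; the samtools --version command is only illustrative:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --docker-image gcr.io/cloud-lifesciences/samtools \
    --command-line 'samtools --version'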

Checking the status of a pipeline

After you run a pipeline using the gcloud CLI, the command returns the following:

Running [projects/PROJECT_ID/locations/LOCATION/operations/OPERATION_ID]

You can use the OPERATION_ID value to check the status of the pipeline by running the following command:

gcloud beta lifesciences operations describe OPERATION_ID

Running the operations describe command provides the following details about the pipeline:

  • Whether it has started
  • Whether it is in progress
  • Whether it finished successfully or encountered errors

To see only whether the operation has completed, use the --format flag:

gcloud --format="value(done)" beta lifesciences operations describe OPERATION_ID

The gcloud CLI provides other features for filtering operations and formatting the displayed values. Read the documentation for the --filter and --format flags for more information.
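
For example, the following sketch lists only completed operations, assuming the operations list command is available in your gcloud CLI version and that your operations are stored in us-central1:

gcloud beta lifesciences operations list \
    --location us-central1 \
    --filter "done=true" \
    --format "table(name, done)"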

Cancelling a pipeline run

To cancel a running pipeline, run gcloud beta lifesciences operations cancel and provide the operation ID:

gcloud beta lifesciences operations cancel OPERATION_ID

Note that the operation will not be immediately marked as done. The Compute Engine VM must be deleted before the operation completes, and this might take a few minutes.
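
If you need to wait until the cancellation takes effect, one option is a small polling loop. This is a sketch; it assumes the done field prints as True once the operation completes:

while [ "$(gcloud --format='value(done)' beta lifesciences operations describe OPERATION_ID)" != "True" ]; do
    sleep 30
done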

Passing input parameters

Use the --inputs flag to pass input parameters to the Docker image:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'echo "${MESSAGE}"' \
    --inputs MESSAGE='hello world!'

The parameters are passed by name as environment variables to the command running inside the Docker container.
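
You can pass several parameters at once as a comma-separated list. In the following sketch, GREETING and NAME are arbitrary variable names chosen for illustration:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'echo "${GREETING}, ${NAME}!"' \
    --inputs GREETING=hello,NAME=world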

Specifying input and output files

Use the --inputs flag to specify input files:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'cat ${INPUT_FILE} | md5sum' \
    --inputs INPUT_FILE=gs://BUCKET/INPUT_FILE

The gs://BUCKET/my-path/example.log log file contains the resulting md5sum.

Use the --outputs flag to write the results of the command to a file in Cloud Storage:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'cat "${INPUT_FILE}" | md5sum > "${OUTPUT_FILE}"' \
    --inputs INPUT_FILE=gs://BUCKET/INPUT_FILE \
    --outputs OUTPUT_FILE=gs://BUCKET/OUTPUT_FILE.md5

Input and output file paths can contain wildcards, so you can pass multiple files at once. Recursive copying is not available.
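
For example, the following sketch computes checksums for all .txt objects under a prefix. It assumes the wildcard matches at least one object and that the shell running the command expands the local copies:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'md5sum ${INPUT_FILES}' \
    --inputs INPUT_FILES='gs://BUCKET/inputs/*.txt'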

Using preemptible VMs

You can use a preemptible VM, which can be up to 80% cheaper than a regular VM. However, Compute Engine might terminate (preempt) the VM if it needs the VM's resources for other tasks. If your VM is preempted, you must restart your pipeline.

To use a preemptible VM, use the --preemptible flag when running your pipeline:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'echo "hello world!"' \
    --preemptible

Setting VM instance types

By default, the Compute Engine VM that runs the pipeline is an n1-standard-1. You can specify the amount of memory and the number of CPU cores by selecting a different machine type with the --machine-type flag:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --command-line 'echo "hello world!"' \
    --machine-type MACHINE_TYPE

If you are specifying your pipeline using a YAML or JSON file, you can specify any supported Compute Engine machine type by entering it in the machineType field.
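
A minimal sketch of a pipeline file that sets the machine type through the resources field; the field names follow the Pipeline message described in the reference documentation:

actions:
- commands: [ '-c', 'echo "hello world!"' ]
  imageUri: bash
resources:
  virtualMachine:
    machineType: n1-highmem-2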

Writing complex pipeline definitions

For more complex pipelines, define the pipeline in a YAML or JSON file and pass the file to the gcloud CLI with the --pipeline-file flag. The file must contain a single Pipeline message, as described in the reference documentation. Defining your pipeline this way lets you specify advanced features including multiple disks and background containers.

To convert the command from the hello world! example to a pipeline file, copy the following text and save it to a file named hello.yaml:

actions:
- commands: [ '-c', 'echo "Hello, world!"' ]
  imageUri: bash

Run the hello.yaml file by passing the --pipeline-file flag to the gcloud CLI:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --pipeline-file hello.yaml

Using multiple Docker containers

The preceding examples focus on running a single command in a single Docker container. If the pipeline requires running separate commands in multiple containers, you must define the pipeline in a YAML or JSON file.

You can add more steps to a YAML or JSON configuration file by entering them in the actions list. To run a pipeline using two Docker containers, add the following to a YAML configuration file:

actions:
- commands: [ 'echo', 'Hello from bash!' ]
  imageUri: bash
- commands: [ 'echo', 'Hello from ubuntu!' ]
  imageUri: ubuntu

When you run the pipeline, the first command runs in the bash Docker image, and the second command runs in the ubuntu Docker image.
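
For example, if you save the preceding configuration to a file named multi.yaml (an illustrative name), you can run it the same way as the earlier pipeline file:

gcloud beta lifesciences pipelines run \
    --regions us-east1 \
    --logging gs://BUCKET/my-path/example.log \
    --pipeline-file multi.yaml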