Package google.genomics.v1alpha2

PipelinesV1Alpha2

A service for running genomics pipelines.

CreatePipeline

rpc CreatePipeline(CreatePipelineRequest) returns (Pipeline)

Creates a pipeline that can be run later. Create takes a Pipeline that has all fields other than pipelineId populated, and then returns the same pipeline with pipelineId populated. This id can be used to run the pipeline.

Caller must have WRITE permission to the project.

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/genomics

For more information, see the Authentication Overview.

DeletePipeline

rpc DeletePipeline(DeletePipelineRequest) returns (Empty)

Deletes a pipeline based on ID.

Caller must have WRITE permission to the project.

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/genomics

For more information, see the Authentication Overview.

GetControllerConfig

rpc GetControllerConfig(GetControllerConfigRequest) returns (ControllerConfig)

Gets controller configuration information. Should only be called by VMs created by the Pipelines Service and not by end users.

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/genomics

For more information, see the Authentication Overview.

GetPipeline

rpc GetPipeline(GetPipelineRequest) returns (Pipeline)

Retrieves a pipeline based on ID.

Caller must have READ permission to the project.

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/genomics

For more information, see the Authentication Overview.

ListPipelines

rpc ListPipelines(ListPipelinesRequest) returns (ListPipelinesResponse)

Lists pipelines.

Caller must have READ permission to the project.

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/genomics

For more information, see the Authentication Overview.

RunPipeline

rpc RunPipeline(RunPipelineRequest) returns (Operation)

Runs a pipeline. If pipelineId is specified in the request, then run a saved pipeline. If ephemeralPipeline is specified, then run that pipeline once without saving a copy.

The caller must have READ permission to the project where the pipeline is stored and WRITE permission to the project where the pipeline will be run, as VMs will be created and storage will be used.

If a pipeline operation is still running after 6 days, it will be canceled.

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/compute
  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/genomics

For more information, see the Authentication Overview.

SetOperationStatus

rpc SetOperationStatus(SetOperationStatusRequest) returns (Empty)

Sets status of a given operation. Any new timestamps (as determined by description) are appended to TimestampEvents. Should only be called by VMs created by the Pipelines Service and not by end users.

Authorization Scopes

Requires one of the following OAuth scopes:

  • https://www.googleapis.com/auth/cloud-platform
  • https://www.googleapis.com/auth/genomics

For more information, see the Authentication Overview.

ComputeEngine

Describes a Compute Engine resource that is being managed by a running pipeline.

Fields
instance_name

string

The instance on which the operation is running.

zone

string

The availability zone in which the instance resides.

machine_type

string

The machine type of the instance.

disk_names[]

string

The names of the disks that were created for this pipeline.

ControllerConfig

Stores the information that the controller will fetch from the server in order to run. Should only be used by VMs created by the Pipelines Service and not by end users.

Fields
image

string

cmd

string

gcs_log_path

string

machine_type

string

vars

map<string, string>

disks

map<string, string>

gcs_sources

map<string, RepeatedString>

gcs_sinks

map<string, RepeatedString>

RepeatedString

Fields
values[]

string

CreatePipelineRequest

The request to create a pipeline. The pipeline field here should not have pipelineId populated, as that will be populated by the server.

Fields
pipeline

Pipeline

The pipeline to create. Should not have pipelineId populated.

DeletePipelineRequest

The request to delete a saved pipeline by ID.

Fields
pipeline_id

string

Caller must have WRITE access to the project in which this pipeline is defined.

DockerExecutor

The Docker executor specification.

Fields
image_name

string

Required. Image name from either Docker Hub or Google Container Registry. Users that run pipelines must have READ access to the image.

cmd

string

Required. The command or newline delimited script to run. The command string will be executed within a bash shell.

If the command exits with a non-zero exit code, output parameter de-localization will be skipped and the pipeline operation's error field will be populated.

Maximum command string length is 16384.

GetControllerConfigRequest

Request to get controller configuration. Should only be used by VMs created by the Pipelines Service and not by end users.

Fields
operation_id

string

The operation to retrieve controller configuration for.

validation_token

uint64

GetPipelineRequest

A request to get a saved pipeline by id.

Fields
pipeline_id

string

Caller must have READ access to the project in which this pipeline is defined.

ListPipelinesRequest

A request to list pipelines in a given project. Pipelines can be filtered by name using namePrefix: all pipelines with names that begin with namePrefix will be returned. Uses standard pagination: pageSize indicates how many pipelines to return, and pageToken comes from a previous ListPipelinesResponse to indicate offset.

Fields
project_id

string

Required. The name of the project to search for pipelines. Caller must have READ access to this project.

name_prefix

string

Pipelines with names that match this prefix should be returned. If unspecified, all pipelines in the project, up to pageSize, will be returned.

page_size

int32

Number of pipelines to return at once. Defaults to 256, and max is 2048.

page_token

string

Token to use to indicate where to start getting results. If unspecified, returns the first page of results.

ListPipelinesResponse

The response of ListPipelines. Contains at most pageSize pipelines. If it contains pageSize pipelines, and more pipelines exist, then nextPageToken will be populated and should be used as the pageToken argument to a subsequent ListPipelines request.

Fields
pipelines[]

Pipeline

The matched pipelines.

next_page_token

string

The token to use to get the next page of results.
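
The pagination contract described above can be sketched as a generator that keeps requesting pages until nextPageToken is absent. This is a minimal sketch, not a client for the real service: `list_page` is a hypothetical callable standing in for whatever sends a ListPipelinesRequest, and `make_fake_list_page` is an in-memory stand-in added only so the example runs. The camelCase keys follow the names used in this reference.

```python
def iter_pipelines(list_page, project_id, name_prefix="", page_size=256):
    """Yield every pipeline in a project, following nextPageToken."""
    page_token = ""
    while True:
        request = {
            "projectId": project_id,
            "namePrefix": name_prefix,
            "pageSize": page_size,
        }
        if page_token:
            request["pageToken"] = page_token
        response = list_page(request)
        yield from response.get("pipelines", [])
        page_token = response.get("nextPageToken", "")
        if not page_token:
            return


def make_fake_list_page(pipelines):
    """Hypothetical in-memory backend; tokens here are plain list offsets."""
    def list_page(request):
        prefix = request.get("namePrefix", "")
        matched = [p for p in pipelines if p["name"].startswith(prefix)]
        start = int(request.get("pageToken", "0"))
        end = start + request["pageSize"]
        response = {"pipelines": matched[start:end]}
        if end < len(matched):
            response["nextPageToken"] = str(end)
        return response
    return list_page


stored = [{"name": f"align-{i}"} for i in range(5)] + [{"name": "other"}]
found = list(iter_pipelines(make_fake_list_page(stored), "my-project", "align-", 2))
```

Real page tokens should be treated as opaque strings; only the fake backend interprets them as offsets.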

LoggingOptions

The logging options for the pipeline run.

Fields
gcs_path

string

The location in Google Cloud Storage to which the pipeline logs will be copied. Can be specified in one of two ways: as a fully qualified directory path, in which case logs are written to that directory with a unique identifier as the filename; or as a fully specified file path, which must end in .log, in which case that exact path is used and the user must ensure that logs are not overwritten. Stdout and stderr logs from the run are also generated and output as -stdout.log and -stderr.log.
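
The two accepted forms of gcs_path can be told apart with a simple suffix check. The helper below is a hypothetical illustration, not part of the API; the gs:// prefix check is an assumption, since this field's description does not spell out the URL scheme.

```python
def log_path_kind(gcs_path):
    """Classify a logging path as a single log file or a directory."""
    # Assumption: logging paths use the gs:// scheme like other
    # Cloud Storage paths in this reference.
    if not gcs_path.startswith("gs://"):
        raise ValueError("expected a Google Cloud Storage path: %r" % gcs_path)
    # A fully specified file path must end in .log; anything else is
    # treated as a directory.
    return "file" if gcs_path.endswith(".log") else "directory"


kind_for_directory = log_path_kind("gs://my-bucket/logs/")
kind_for_file = log_path_kind("gs://my-bucket/logs/run1.log")
```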

Pipeline

The pipeline object. Represents a transformation from a set of input parameters to a set of output parameters. The transformation is defined as a docker image and command to run within that image. Each pipeline is run on a Google Compute Engine VM. A pipeline can be created with the create method and then later run with the run method, or a pipeline can be defined and run all at once with the run method.

Fields
project_id

string

Required. The project in which to create the pipeline. The caller must have WRITE access.

name

string

Required. A user specified pipeline name that does not have to be unique. This name can be used for filtering Pipelines in ListPipelines.

description

string

User-specified description.

input_parameters[]

PipelineParameter

Input parameters of the pipeline.

output_parameters[]

PipelineParameter

Output parameters of the pipeline.

resources

PipelineResources

Required. Specifies resource requirements for the pipeline run. Required fields:

* minimumCpuCores

* minimumRamGb

pipeline_id

string

Unique pipeline id that is generated by the service when CreatePipeline is called. Cannot be specified in the Pipeline used in the CreatePipelineRequest, and will be populated in the response to CreatePipeline and all subsequent Get and List calls. Indicates that the service has registered this pipeline.

docker

DockerExecutor

Specifies the docker run information.
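
As an illustration of how the fields above fit together, the following sketch assembles a Pipeline as a plain dictionary and wraps it the way the CreatePipelineRequest message does. It assumes the camelCase JSON spelling of the field names (for example imageName for image_name), and the project, image, and command values are placeholders. Nothing is sent to the service.

```python
import json


def build_pipeline(project_id, name, image_name, cmd):
    """Return a Pipeline body for CreatePipeline; pipelineId is left unset."""
    return {
        "projectId": project_id,
        "name": name,
        "description": "Hypothetical example pipeline.",
        "inputParameters": [
            {"name": "input_file", "localCopy": {"path": "file.txt", "disk": "pd1"}},
        ],
        "outputParameters": [
            {"name": "output_file", "localCopy": {"path": "out.txt", "disk": "pd1"}},
        ],
        # minimumCpuCores and minimumRamGb are the required resource fields.
        "resources": {
            "minimumCpuCores": 1,
            "minimumRamGb": 3.75,
            "disks": [
                {"name": "pd1", "type": "PERSISTENT_HDD", "mountPoint": "/mnt/disk"},
            ],
        },
        "docker": {"imageName": image_name, "cmd": cmd},
    }


pipeline = build_pipeline(
    project_id="my-project",
    name="copy-file",
    image_name="ubuntu",
    cmd="cp /mnt/disk/file.txt /mnt/disk/out.txt",
)
request_body = json.dumps({"pipeline": pipeline}, indent=2)
```

The dictionary deliberately omits pipelineId, because the service populates it in the response to CreatePipeline.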

PipelineParameter

Parameters facilitate setting and delivering data into the pipeline's execution environment. They are defined at create time, with optional defaults, and can be overridden at run time.

If localCopy is unset, then the parameter specifies a string that is passed as-is into the pipeline, as the value of the environment variable with the given name. A default value can be optionally specified at create time. The default can be overridden at run time using the inputs map. If no default is given, a value must be supplied at runtime.

If localCopy is defined, then the parameter specifies a data source or sink, both in Google Cloud Storage and on the Docker container where the pipeline computation is run. The service account associated with the Pipeline (by default the project's Compute Engine service account) must have access to the Google Cloud Storage paths.

At run time, the Google Cloud Storage paths can be overridden if a default was provided at create time, or must be set otherwise. The pipeline runner should add a key/value pair to either the inputs or outputs map. The indicated data copies will be carried out before/after pipeline execution, just as if the corresponding arguments were provided to gsutil cp.

For example: Given the following PipelineParameter, specified in the inputParameters list:

{name: "input_file", localCopy: {path: "file.txt", disk: "pd1"}}

where disk is defined in the PipelineResources object as:

{name: "pd1", mountPoint: "/mnt/disk/"}

We create a disk named pd1, mount it on the host VM, and map /mnt/pd1 to /mnt/disk in the docker container. At runtime, an entry for input_file would be required in the inputs map, such as:

  inputs["input_file"] = "gs://my-bucket/bar.txt"

This would generate the following gsutil call:

  gsutil cp gs://my-bucket/bar.txt /mnt/pd1/file.txt

The file /mnt/pd1/file.txt maps to /mnt/disk/file.txt in the Docker container. Acceptable paths are:

Google Cloud Storage path    Local path
file                         file
glob                         directory

For outputs, the direction of the copy is reversed:

  gsutil cp /mnt/disk/file.txt gs://my-bucket/bar.txt

Acceptable paths are:

Local path    Google Cloud Storage path
file          file
file          directory (directory must already exist)
glob          directory (directory will be created if it doesn't exist)

One restriction, due to Docker limitations, is that for outputs found on the boot disk, the local path must be a file and cannot be a glob.
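
The input example above can be reproduced with a small helper that derives the gsutil command and the path seen inside the container. This is a sketch under stated assumptions: the host mount location /mnt/<disk name> is inferred from the example only, the function name is hypothetical, and the "boot" disk is not handled.

```python
import posixpath


def localize_input(parameter, disks, gcs_path):
    """Return (gsutil command, container path) for one input parameter."""
    local_copy = parameter["localCopy"]
    disk = next(d for d in disks if d["name"] == local_copy["disk"])
    # Assumption taken from the example: the disk is mounted on the host
    # VM under /mnt/<disk name>.
    host_path = posixpath.join("/mnt", disk["name"], local_copy["path"])
    # Inside the Docker container the same file appears under mountPoint.
    container_path = posixpath.join(disk["mountPoint"], local_copy["path"])
    return ["gsutil", "cp", gcs_path, host_path], container_path


parameter = {"name": "input_file", "localCopy": {"path": "file.txt", "disk": "pd1"}}
disks = [{"name": "pd1", "mountPoint": "/mnt/disk/"}]
inputs = {"input_file": "gs://my-bucket/bar.txt"}

command, container_path = localize_input(parameter, disks, inputs[parameter["name"]])
```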

Fields
name

string

Required. Name of the parameter - the pipeline runner uses this string as the key to the input and output maps in RunPipeline.

description

string

Human-readable description.

default_value

string

The default value for this parameter. Can be overridden at runtime. If localCopy is present, then this must be a Google Cloud Storage path beginning with gs://.

local_copy

LocalCopy

If present, this parameter is marked for copying to and from the VM. LocalCopy indicates where on the VM the file should be. The value given to this parameter (either at runtime or using defaultValue) must be the remote path where the file should be.
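
The default-and-override behavior of parameters can be sketched as a merge between the declared parameters and the inputs (or outputs) map supplied at run time. The function name is hypothetical, and how the service treats keys that match no declared parameter is not stated here, so this sketch simply ignores them.

```python
def resolve_arguments(parameters, supplied):
    """Merge run-time values with create-time defaults for each parameter."""
    resolved = {}
    missing = []
    for parameter in parameters:
        name = parameter["name"]
        if name in supplied:
            # A run-time value overrides any default.
            resolved[name] = supplied[name]
        elif "defaultValue" in parameter:
            resolved[name] = parameter["defaultValue"]
        else:
            # No default and no run-time value: the run cannot proceed.
            missing.append(name)
    if missing:
        raise ValueError("missing values for: " + ", ".join(missing))
    return resolved


declared = [
    {"name": "input_file", "localCopy": {"path": "file.txt", "disk": "pd1"}},
    {"name": "threads", "defaultValue": "4"},
]
resolved = resolve_arguments(declared, {"input_file": "gs://my-bucket/bar.txt"})
overridden = resolve_arguments(
    declared, {"input_file": "gs://my-bucket/bar.txt", "threads": "8"}
)
```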

LocalCopy

LocalCopy defines how a remote file should be copied to and from the VM.

Fields
path

string

Required. The path within the user's docker container where this input should be localized to and from, relative to the specified disk's mount point. For example: file.txt

disk

string

Required. The name of the disk where this parameter is located. Can be the name of one of the disks specified in the Resources field, or "boot", which represents the Docker instance's boot disk and has a mount point of /.

PipelineResources

The system resources for the pipeline run.

Fields
minimum_cpu_cores

int32

The minimum number of cores to use. Defaults to 1.

preemptible

bool

Whether to use preemptible VMs. Defaults to false. In order to use this, must be true for both create time and run time. Cannot be true at run time if false at create time.

minimum_ram_gb

double

The minimum amount of RAM to use. Defaults to 3.75 (GB)

disks[]

Disk

Disks to attach.

zones[]

string

List of Google Compute Engine availability zones to which resource creation will be restricted. If empty, any zone may be chosen.

boot_disk_size_gb

int32

The size of the boot disk. Defaults to 10 (GB).

no_address

bool

Whether to omit the external IP address from the instance. This is an experimental feature that may go away. Defaults to false. Corresponds to --no_address flag for gcloud compute instances create. In order to use this, must be true for both create time and run time. Cannot be true at run time if false at create time. If you need to ssh into a private IP VM for debugging, you can ssh to a public VM and then ssh into the private VM's Internal IP. If noAddress is set, this pipeline run may only load docker images from Google Container Registry and not Docker Hub. Before using this, you must configure access to Google services from internal IPs.

accelerator_type

string

Optional. The Compute Engine defined accelerator type. By specifying this parameter, you will download and install the following third-party software onto your managed Compute Engine instances: NVIDIA® Tesla® drivers and NVIDIA® CUDA toolkit. Please see https://cloud.google.com/compute/docs/gpus/ for a list of available accelerator types.

accelerator_count

int64

Optional. The number of accelerators of the specified type to attach. By specifying this parameter, you will download and install the following third-party software onto your managed Compute Engine instances: NVIDIA® Tesla® drivers and NVIDIA® CUDA toolkit.
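
The create-time versus run-time rule shared by preemptible and noAddress can be expressed as a small check. This is a sketch of the rule as worded in the field descriptions; the function name and error message are hypothetical.

```python
def effective_flag(name, create_value, run_value):
    """Combine a create-time and a run-time boolean resource flag."""
    if run_value and not create_value:
        # The flag cannot be true at run time if false at create time.
        raise ValueError(
            "%s cannot be true at run time if false at create time" % name
        )
    # The flag takes effect only when true at both create and run time.
    return bool(create_value and run_value)


uses_preemptible = effective_flag("preemptible", True, True)
falls_back = effective_flag("preemptible", True, False)
```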

Disk

A Google Compute Engine disk resource specification.

Fields
name

string

Required. The name of the disk that can be used in the pipeline parameters. Must be 1 - 63 characters. The name "boot" is reserved for system use.

type

Type

Required. The type of the disk to create.

size_gb

int32

The size of the disk. Defaults to 500 (GB). This field is not applicable for local SSD.

source

string

The full or partial URL of the persistent disk to attach. See https://cloud.google.com/compute/docs/reference/latest/instances#resource and https://cloud.google.com/compute/docs/disks/persistent-disks#snapshots for more details.

auto_delete

bool

Deprecated. Disks created by the Pipelines API will be deleted at the end of the pipeline run, regardless of what this field is set to.

read_only

bool

Specifies how a source-based persistent disk will be mounted. See https://cloud.google.com/compute/docs/disks/persistent-disks#use_multi_instances for more details. Can only be set at create time.

mount_point

string

Required at create time and cannot be overridden at run time. Specifies the path in the docker container where files on this disk should be located. For example, if mountPoint is /mnt/disk, and the parameter has localPath inputs/file.txt, the docker container can access the data at /mnt/disk/inputs/file.txt.

Type

The types of disks that may be attached to VMs.

Enums
TYPE_UNSPECIFIED Default disk type. Use one of the other options below.
PERSISTENT_HDD Specifies a Google Compute Engine persistent hard disk. See https://cloud.google.com/compute/docs/disks/#pdspecs for details.
PERSISTENT_SSD Specifies a Google Compute Engine persistent solid-state disk. See https://cloud.google.com/compute/docs/disks/#pdspecs for details.
LOCAL_SSD Specifies a Google Compute Engine local SSD. See https://cloud.google.com/compute/docs/disks/local-ssd for details.

RunPipelineArgs

The pipeline run arguments.

Fields
project_id

string

Required. The project in which to run the pipeline. The caller must have WRITER access to all Google Cloud services and resources (e.g. Google Compute Engine) that will be used.

inputs

map<string, string>

Pipeline input arguments; keys are defined in the pipeline documentation. All input parameters that do not have default values must be specified. If parameters with defaults are specified here, the defaults will be overridden.

outputs

map<string, string>

Pipeline output arguments; keys are defined in the pipeline documentation. All output parameters without default values must be specified. If parameters with defaults are specified here, the defaults will be overridden.

service_account

ServiceAccount

The Google Cloud Service Account that will be used to access data and services. By default, the compute service account associated with projectId is used.

client_id

string

This field is deprecated. Use labels instead. Client-specified pipeline operation identifier.

resources

PipelineResources

Specifies resource requirements/overrides for the pipeline run.

logging

LoggingOptions

Required. Logging options. Used by the service to communicate results to the user.

keep_vm_alive_on_failure_duration

Duration

How long to keep the VM up after a failure (for example docker command failed, copying input or output files failed, etc). While the VM is up, one can ssh into the VM to debug. Default is 0; maximum allowed value is 1 day.

labels

map<string, string>

Labels to apply to this pipeline run. Labels will also be applied to compute resources (VM, disks) created by this pipeline run. When listing operations, operations can be filtered by labels. Label keys may not be empty; label values may be empty. Non-empty labels must be 1-63 characters long, and comply with RFC1035. Specifically, the name must be 1-63 characters long and match the regular expression [a-z]([-a-z0-9]*[a-z0-9])? which means the first character must be a lowercase letter, and all following characters must be a dash, lowercase letter, or digit, except the last character, which cannot be a dash.
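
The label rules above can be checked locally before submitting a run. The sketch below applies the stated regular expression and length limits; the function names and error strings are hypothetical, and the service may enforce additional rules not listed here.

```python
import re

# The regular expression quoted in the labels field description.
_LABEL_RE = re.compile(r"[a-z]([-a-z0-9]*[a-z0-9])?")


def _is_valid_name(text):
    """True if text is 1-63 characters and matches the label pattern."""
    return 1 <= len(text) <= 63 and _LABEL_RE.fullmatch(text) is not None


def validate_labels(labels):
    """Return a list of problems found in a labels map (empty if none)."""
    errors = []
    for key, value in labels.items():
        if not key:
            errors.append("label key must not be empty")
        elif not _is_valid_name(key):
            errors.append("invalid label key: %r" % key)
        # Values may be empty; only non-empty values are checked.
        if value and not _is_valid_name(value):
            errors.append("invalid label value for %r: %r" % (key, value))
    return errors


ok = validate_labels({"team": "genomics", "run-1": ""})
bad = validate_labels({"Team": "x", "stage": "final-"})
```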

RunPipelineRequest

The request to run a pipeline. If pipelineId is specified, it refers to a saved pipeline created with CreatePipeline and set as the pipelineId of the returned Pipeline object. If ephemeralPipeline is specified, that pipeline is run once with the given args and not saved. It is an error to specify both pipelineId and ephemeralPipeline. pipelineArgs must be specified.

Fields
pipeline_args

RunPipelineArgs

The arguments to use when running this pipeline.

Union field pipeline.

pipeline can be only one of the following:

pipeline_id

string

The already created pipeline to run.

ephemeral_pipeline

Pipeline

A new pipeline object to run once and then delete.
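
The union between pipelineId and ephemeralPipeline can be enforced on the client before a request is built. This is a minimal sketch with a hypothetical function name. Rejecting a request that sets neither field is an assumption drawn from the union definition; the text above only states that setting both is an error.

```python
def build_run_request(pipeline_args, pipeline_id=None, ephemeral_pipeline=None):
    """Return a RunPipelineRequest body with exactly one pipeline source."""
    if pipeline_args is None:
        raise ValueError("pipelineArgs must be specified")
    if pipeline_id is not None and ephemeral_pipeline is not None:
        raise ValueError("specify pipelineId or ephemeralPipeline, not both")
    if pipeline_id is None and ephemeral_pipeline is None:
        # Assumption: a request with neither field set is also rejected.
        raise ValueError("specify pipelineId or ephemeralPipeline")
    request = {"pipelineArgs": pipeline_args}
    if pipeline_id is not None:
        request["pipelineId"] = pipeline_id
    else:
        request["ephemeralPipeline"] = ephemeral_pipeline
    return request


args = {
    "projectId": "my-project",
    "inputs": {"input_file": "gs://my-bucket/bar.txt"},
    "logging": {"gcsPath": "gs://my-bucket/logs/"},
}
saved_run = build_run_request(args, pipeline_id="example-pipeline-id")
```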

RuntimeMetadata

Runtime metadata that will be populated in the runtimeMetadata field of the Operation associated with a RunPipeline execution.

Fields
compute_engine

ComputeEngine

Execution information specific to Google Compute Engine.

ServiceAccount

A Google Cloud Service Account.

Fields
email

string

Email address of the service account. Defaults to default, which uses the compute service account associated with the project.

scopes[]

string

List of scopes to be enabled for this service account on the VM. The following scopes are automatically included:

SetOperationStatusRequest

Request to set operation status. Should only be used by VMs created by the Pipelines Service and not by end users.

Fields
operation_id

string

timestamp_events[]

TimestampEvent

error_code

Code

error_message

string

validation_token

uint64

TimestampEvent

Stores the list of events and the times they occurred for major events in job execution.

Fields
description

string

String indicating the type of event.

timestamp

Timestamp

The time this event occurred.
