REST Resource: pipelines

Resource: Pipeline

The pipeline object. Represents a transformation from a set of input parameters to a set of output parameters. The transformation is defined as a docker image and command to run within that image. Each pipeline is run on a Google Compute Engine VM. A pipeline can be created with the create method and then later run with the run method, or a pipeline can be defined and run all at once with the run method.

JSON representation
{
  "projectId": string,
  "name": string,
  "description": string,
  "inputParameters": [
    {
      object (PipelineParameter)
    }
  ],
  "outputParameters": [
    {
      object (PipelineParameter)
    }
  ],
  "resources": {
    object (PipelineResources)
  },
  "pipelineId": string,
  "docker": {
    object (DockerExecutor)
  }
}
Fields
projectId

string

Required. The project in which to create the pipeline. The caller must have WRITE access.

name

string

Required. A user-specified pipeline name that does not have to be unique. This name can be used for filtering Pipelines in pipelines.list.

description

string

User-specified description.

inputParameters[]

object (PipelineParameter)

Input parameters of the pipeline.

outputParameters[]

object (PipelineParameter)

Output parameters of the pipeline.

resources

object (PipelineResources)

Required. Specifies resource requirements for the pipeline run. Required fields:

* minimumCpuCores

* minimumRamGb

pipelineId

string

Unique pipeline id that is generated by the service when pipelines.create is called. Cannot be specified in the Pipeline used in the CreatePipelineRequest, and will be populated in the response to pipelines.create and all subsequent Get and List calls. Indicates that the service has registered this pipeline.

docker

object (DockerExecutor)

Specifies the docker run information.
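As an illustration of how the fields above fit together, the following sketch builds a Pipeline body as a Python dictionary and serializes it to JSON. The project ID, names, image, and command are hypothetical placeholders, not values from this reference. pipelineId is left out because the service generates it when pipelines.create is called.

```python
import json

# All values below are hypothetical placeholders chosen for illustration.
pipeline = {
    "projectId": "my-project",
    "name": "word-count",
    "description": "Counts the words in an input file.",
    "inputParameters": [
        {"name": "input_file", "localCopy": {"path": "file.txt", "disk": "pd1"}},
    ],
    "outputParameters": [
        {"name": "output_file", "localCopy": {"path": "count.txt", "disk": "pd1"}},
    ],
    "resources": {
        # minimumCpuCores and minimumRamGb are the fields listed as required.
        "minimumCpuCores": 1,
        "minimumRamGb": 3.75,
        "disks": [
            {"name": "pd1", "type": "PERSISTENT_HDD", "mountPoint": "/mnt/disk"},
        ],
    },
    "docker": {
        "imageName": "ubuntu",
        "cmd": "wc -w /mnt/disk/file.txt > /mnt/disk/count.txt",
    },
}

# Serialize the body; how it is sent to the service is outside this sketch.
body = json.dumps(pipeline, indent=2)
```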

PipelineParameter

Parameters facilitate setting and delivering data into the pipeline's execution environment. They are defined at create time, with optional defaults, and can be overridden at run time.

If localCopy is unset, then the parameter specifies a string that is passed as-is into the pipeline, as the value of the environment variable with the given name. A default value can be optionally specified at create time. The default can be overridden at run time using the inputs map. If no default is given, a value must be supplied at runtime.

If localCopy is defined, then the parameter specifies a data source or sink, both in Google Cloud Storage and on the Docker container where the pipeline computation is run. The service account associated with the Pipeline (by default the project's Compute Engine service account) must have access to the Google Cloud Storage paths.

At run time, the Google Cloud Storage paths can be overridden if a default was provided at create time, or must be set otherwise. The pipeline runner should add a key/value pair to either the inputs or outputs map. The indicated data copies will be carried out before/after pipeline execution, just as if the corresponding arguments were provided to gsutil cp.

For example: Given the following PipelineParameter, specified in the inputParameters list:

{name: "input_file", localCopy: {path: "file.txt", disk: "pd1"}}

where disk is defined in the PipelineResources object as:

{name: "pd1", mountPoint: "/mnt/disk/"}

We create a disk named pd1, mount it on the host VM, and map /mnt/pd1 to /mnt/disk in the docker container. At runtime, an entry for input_file would be required in the inputs map, such as:

  inputs["input_file"] = "gs://my-bucket/bar.txt"

This would generate the following gsutil call:

  gsutil cp gs://my-bucket/bar.txt /mnt/pd1/file.txt

The file /mnt/pd1/file.txt maps to /mnt/disk/file.txt in the Docker container. Acceptable paths are:

Google Cloud Storage path    Local path
file                         file
glob                         directory

For outputs, the direction of the copy is reversed:

  gsutil cp /mnt/disk/file.txt gs://my-bucket/bar.txt

Acceptable paths are:

Local path    Google Cloud Storage path
file          file
file          directory (the directory must already exist)
glob          directory (the directory will be created if it doesn't exist)

Because of a Docker limitation, an output located on the boot disk must use a local path that names a single file; it cannot be a glob.
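The copy direction and the boot disk restriction can be sketched as a small helper. This is an illustration only, not the service's implementation: the function name gsutil_copy is hypothetical, and treating any path containing *, ?, or [ as a glob is an assumption of this sketch.

```python
GLOB_CHARS = "*?["  # assumption: these characters mark a path as a glob


def gsutil_copy(local_path, gcs_path, is_output, disk="pd1"):
    """Return the gsutil argument list implied by one localCopy parameter.

    Inputs are copied from Cloud Storage to the local path; outputs are
    copied in the reverse direction.
    """
    is_glob = any(ch in local_path for ch in GLOB_CHARS)
    if is_output and disk == "boot" and is_glob:
        raise ValueError("outputs on the boot disk must be a single file")
    if is_output:
        return ["gsutil", "cp", local_path, gcs_path]
    return ["gsutil", "cp", gcs_path, local_path]


# The two commands shown in the example above.
input_cmd = gsutil_copy("/mnt/pd1/file.txt", "gs://my-bucket/bar.txt", is_output=False)
output_cmd = gsutil_copy("/mnt/disk/file.txt", "gs://my-bucket/bar.txt", is_output=True)
```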

JSON representation
{
  "name": string,
  "description": string,
  "defaultValue": string,
  "localCopy": {
    object (LocalCopy)
  }
}
Fields
name

string

Required. Name of the parameter - the pipeline runner uses this string as the key to the input and output maps in pipelines.run.

description

string

Human-readable description.

defaultValue

string

The default value for this parameter. Can be overridden at runtime. If localCopy is present, then this must be a Google Cloud Storage path beginning with gs://.

localCopy

object (LocalCopy)

If present, this parameter is marked for copying to and from the VM. LocalCopy indicates where on the VM the file should be. The value given to this parameter (either at runtime or using defaultValue) must be the remote path where the file should be.
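The interaction between defaultValue and the run-time inputs or outputs map can be sketched as follows. The function name resolve_values is hypothetical, and applying the gs:// check to run-time values as well as defaults is an assumption of this sketch; the reference states the gs:// requirement only for defaultValue.

```python
def resolve_values(parameters, supplied):
    """Pick the effective value of each parameter.

    A run-time value in `supplied` overrides defaultValue. A parameter with
    neither a run-time value nor a default is an error.
    """
    resolved = {}
    for param in parameters:
        name = param["name"]
        if name in supplied:
            value = supplied[name]
        elif "defaultValue" in param:
            value = param["defaultValue"]
        else:
            raise ValueError(f"no value supplied for parameter {name!r}")
        # Assumption: localCopy values are always Cloud Storage paths.
        if "localCopy" in param and not value.startswith("gs://"):
            raise ValueError(f"{name!r} must be a gs:// path, got {value!r}")
        resolved[name] = value
    return resolved


# Hypothetical parameters: one file input and one plain string with a default.
params = [
    {"name": "input_file", "localCopy": {"path": "file.txt", "disk": "pd1"}},
    {"name": "THREADS", "defaultValue": "4"},
]
values = resolve_values(params, {"input_file": "gs://my-bucket/bar.txt"})
```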

LocalCopy

LocalCopy defines how a remote file should be copied to and from the VM.

JSON representation
{
  "path": string,
  "disk": string
}
Fields
path

string

Required. The path within the user's docker container where this input should be localized to and from, relative to the specified disk's mount point. For example: file.txt.

disk

string

Required. The name of the disk where this parameter is located. Can be the name of one of the disks specified in the Resources field, or "boot", which represents the Docker instance's boot disk and has a mount point of /.
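As a sketch of how path and disk combine, the helper below joins the LocalCopy path onto the mount point of the named disk, treating "boot" as mounted at /. The function name container_path is hypothetical, and stripping a leading slash from path so that it stays under the mount point is an assumption of this sketch.

```python
import posixpath


def container_path(local_copy, disks):
    """Return the path inside the container for a LocalCopy entry."""
    disk_name = local_copy["disk"]
    if disk_name == "boot":
        mount_point = "/"
    else:
        mount_points = {d["name"]: d["mountPoint"] for d in disks}
        if disk_name not in mount_points:
            raise ValueError(f"unknown disk {disk_name!r}")
        mount_point = mount_points[disk_name]
    # Assumption: path is relative; drop any leading slash before joining.
    return posixpath.join(mount_point, local_copy["path"].lstrip("/"))


disks = [{"name": "pd1", "mountPoint": "/mnt/disk/"}]
on_disk = container_path({"path": "file.txt", "disk": "pd1"}, disks)
on_boot = container_path({"path": "out.txt", "disk": "boot"}, disks)
```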

DockerExecutor

The Docker executor specification.

JSON representation
{
  "imageName": string,
  "cmd": string
}
Fields
imageName

string

Required. Image name from either Docker Hub or Google Container Registry. Users that run pipelines must have READ access to the image.

cmd

string

Required. The command or newline delimited script to run. The command string will be executed within a bash shell.

If the command exits with a non-zero exit code, output parameter de-localization will be skipped and the pipeline operation's error field will be populated.

Maximum command string length is 16384.
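A client might assemble a newline-delimited script and check it against the length limit before sending it. The sketch below is illustrative only: the function name docker_executor is hypothetical, and measuring the limit in characters with len() is an assumption, since the reference does not say whether the limit counts characters or bytes.

```python
MAX_CMD_LENGTH = 16384  # limit stated in the reference; unit assumed to be characters


def docker_executor(image_name, script_lines):
    """Build a DockerExecutor dictionary from an image name and script lines."""
    cmd = "\n".join(script_lines)
    if len(cmd) > MAX_CMD_LENGTH:
        raise ValueError(f"cmd is {len(cmd)} characters; maximum is {MAX_CMD_LENGTH}")
    return {"imageName": image_name, "cmd": cmd}


# Hypothetical image and script. The command runs in a bash shell, and a
# non-zero exit code causes output de-localization to be skipped.
executor = docker_executor("ubuntu", ["set -e", "wc -w /mnt/disk/file.txt > /mnt/disk/count.txt"])
```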

PipelineResources

The system resources for the pipeline run.

JSON representation
{
  "minimumCpuCores": number,
  "preemptible": boolean,
  "minimumRamGb": number,
  "disks": [
    {
      object (Disk)
    }
  ],
  "zones": [
    string
  ],
  "bootDiskSizeGb": number,
  "noAddress": boolean,
  "acceleratorType": string,
  "acceleratorCount": string
}
Fields
minimumCpuCores

number

The minimum number of cores to use. Defaults to 1.

preemptible

boolean

Whether to use preemptible VMs. Defaults to false. To use preemptible VMs, this field must be true at both create time and run time. It cannot be true at run time if it was false at create time.

minimumRamGb

number

The minimum amount of RAM to use. Defaults to 3.75 (GB).

disks[]

object (Disk)

Disks to attach.

zones[]

string

List of Google Compute Engine availability zones to which resource creation will be restricted. If empty, any zone may be chosen.

bootDiskSizeGb

number

The size of the boot disk. Defaults to 10 (GB).

noAddress

boolean

If true, the instance is not assigned an external IP address. This is an experimental feature that may go away. Defaults to false. Corresponds to the --no-address flag for gcloud compute instances create.

To use this, the field must be true at both create time and run time. It cannot be true at run time if it was false at create time.

If you need to ssh into a private IP VM for debugging, you can ssh to a public VM and then ssh into the private VM's internal IP. If noAddress is set, this pipeline run may only load docker images from Google Container Registry and not Docker Hub. Before using this, you must configure access to Google services from internal IPs.

acceleratorType

string

Optional. The Compute Engine defined accelerator type. By specifying this parameter, you will download and install the following third-party software onto your managed Compute Engine instances: NVIDIA® Tesla® drivers and NVIDIA® CUDA toolkit. Please see https://cloud.google.com/compute/docs/gpus/ for a list of available accelerator types.

acceleratorCount

string (int64 format)

Optional. The number of accelerators of the specified type to attach. By specifying this parameter, you will download and install the following third-party software onto your managed Compute Engine instances: NVIDIA® Tesla® drivers and NVIDIA® CUDA toolkit.
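The create-time and run-time rule shared by preemptible and noAddress can be sketched as a single check. The function name effective_flag is hypothetical; this shows one reading of the rule stated above, not the service's own validation.

```python
def effective_flag(field, create_value, run_value):
    """Combine the create-time and run-time settings of a boolean field.

    The field takes effect only when true at both times, and it is an
    error for it to be true at run time when it was false at create time.
    """
    if run_value and not create_value:
        raise ValueError(f"{field} cannot be true at run time if false at create time")
    return bool(create_value and run_value)


use_preemptible = effective_flag("preemptible", create_value=True, run_value=True)
use_no_address = effective_flag("noAddress", create_value=True, run_value=False)
```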

Disk

A Google Compute Engine disk resource specification.

JSON representation
{
  "name": string,
  "type": enum (Type),
  "sizeGb": number,
  "source": string,
  "autoDelete": boolean,
  "readOnly": boolean,
  "mountPoint": string
}
Fields
name

string

Required. The name of the disk that can be used in the pipeline parameters. Must be 1 - 63 characters. The name "boot" is reserved for system use.

type

enum (Type)

Required. The type of the disk to create.

sizeGb

number

The size of the disk. Defaults to 500 (GB). This field is not applicable for local SSD.

source

string

The full or partial URL of the persistent disk to attach. See https://cloud.google.com/compute/docs/reference/latest/instances#resource and https://cloud.google.com/compute/docs/disks/persistent-disks#snapshots for more details.

autoDelete

boolean

Deprecated. Disks created by the Pipelines API will be deleted at the end of the pipeline run, regardless of what this field is set to.

readOnly

boolean

Specifies how a source-based persistent disk will be mounted. See https://cloud.google.com/compute/docs/disks/persistent-disks#use_multi_instances for more details. Can only be set at create time.

mountPoint

string

Required at create time and cannot be overridden at run time. Specifies the path in the docker container where files on this disk should be located. For example, if mountPoint is /mnt/disk, and the parameter's LocalCopy path is inputs/file.txt, the docker container can access the data at /mnt/disk/inputs/file.txt.
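The constraints on the Disk fields above can be collected into a simple client-side check. This is a sketch under stated assumptions: the function name check_disk is hypothetical, and rejecting TYPE_UNSPECIFIED follows from the enum's own description ("Use one of the other options below").

```python
DISK_TYPES = {"PERSISTENT_HDD", "PERSISTENT_SSD", "LOCAL_SSD"}


def check_disk(disk):
    """Return a list of problems found in a Disk dictionary at create time."""
    problems = []
    name = disk.get("name", "")
    if not 1 <= len(name) <= 63:
        problems.append("name must be 1 - 63 characters")
    if name == "boot":
        problems.append('the name "boot" is reserved for system use')
    if disk.get("type") not in DISK_TYPES:
        problems.append("type must be one of " + ", ".join(sorted(DISK_TYPES)))
    if not disk.get("mountPoint"):
        problems.append("mountPoint is required at create time")
    return problems


ok = check_disk({"name": "pd1", "type": "PERSISTENT_HDD", "mountPoint": "/mnt/disk"})
bad = check_disk({"name": "boot", "type": "TYPE_UNSPECIFIED"})
```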

Type

The types of disks that may be attached to VMs.

Enums
TYPE_UNSPECIFIED Default disk type. Use one of the other options below.
PERSISTENT_HDD Specifies a Google Compute Engine persistent hard disk. See https://cloud.google.com/compute/docs/disks/#pdspecs for details.
PERSISTENT_SSD Specifies a Google Compute Engine persistent solid-state disk. See https://cloud.google.com/compute/docs/disks/#pdspecs for details.
LOCAL_SSD Specifies a Google Compute Engine local SSD. See https://cloud.google.com/compute/docs/disks/local-ssd for details.

Methods

create
(deprecated)

Creates a pipeline that can be run later.

delete
(deprecated)

Deletes a pipeline based on ID.

get
(deprecated)

Retrieves a pipeline based on ID.

getControllerConfig
(deprecated)

Gets controller configuration information.

list
(deprecated)

Lists pipelines.

run
(deprecated)

Runs a pipeline.

setOperationStatus
(deprecated)

Sets status of a given operation.