Class PipelineJob (1.38.0)

PipelineJob(
    display_name: str,
    template_path: str,
    job_id: typing.Optional[str] = None,
    pipeline_root: typing.Optional[str] = None,
    parameter_values: typing.Optional[typing.Dict[str, typing.Any]] = None,
    input_artifacts: typing.Optional[typing.Dict[str, str]] = None,
    enable_caching: typing.Optional[bool] = None,
    encryption_spec_key_name: typing.Optional[str] = None,
    labels: typing.Optional[typing.Dict[str, str]] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    failure_policy: typing.Optional[str] = None,
)

Retrieves a PipelineJob resource and instantiates its representation.

Parameters

NameDescription
display_name str

Required. The user-defined name of this Pipeline.

template_path str

Required. The path of PipelineJob or PipelineSpec JSON or YAML file. It can be a local path, a Google Cloud Storage URI (e.g. "gs://project.name"), an Artifact Registry URI (e.g. "https://us-central1-kfp.pkg.dev/proj/repo/pack/latest"), or an HTTPS URI.

job_id str

Optional. The unique ID of the job run. If not specified, pipeline name + timestamp will be used.

pipeline_root str

Optional. The root of the pipeline outputs. If not set, the staging bucket set in aiplatform.init will be used. If that's not set a pipeline-specific artifacts bucket will be used.

parameter_values Dict[str, Any]

Optional. The mapping from runtime parameter names to its values that control the pipeline run.

input_artifacts Dict[str, str]

Optional. The mapping from the runtime parameter name for this artifact to its resource id. For example: "vertex_model":"456". Note: full resource name ("projects/123/locations/us-central1/metadataStores/default/artifacts/456") cannot be used.

enable_caching bool

Optional. Whether to turn on caching for the run. If this is not set, defaults to the compile time settings, which are True for all tasks by default, while users may specify different caching options for individual tasks. If this is set, the setting applies to all tasks in the pipeline. Overrides the compile time settings.

encryption_spec_key_name str

Optional. The Cloud KMS resource identifier of the customer managed encryption key used to protect the job. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created. If this is set, then all resources created by the PipelineJob will be encrypted with the provided encryption key. Overrides encryption_spec_key_name set in aiplatform.init.

labels Dict[str, str]

Optional. The user defined metadata to organize PipelineJob.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to create this PipelineJob. Overrides credentials set in aiplatform.init.

project str

Optional. The project that you want to run this PipelineJob in. If not set, the project set in aiplatform.init will be used.

location str

Optional. Location to create PipelineJob. If not set, location set in aiplatform.init will be used.

failure_policy str

Optional. The failure policy - "slow" or "fast". Currently, the default of a pipeline is that the pipeline will continue to run until no more tasks can be executed, also known as PIPELINE_FAILURE_POLICY_FAIL_SLOW (corresponds to "slow"). However, if a pipeline is set to PIPELINE_FAILURE_POLICY_FAIL_FAST (corresponds to "fast"), it will stop scheduling any new tasks when a task has failed. Any scheduled tasks will continue to completion.

Properties

create_time

Time this resource was created.

display_name

Display name of this resource.

encryption_spec

Customer-managed encryption key options for this Vertex AI resource.

If this is set, then all resources created by this Vertex AI resource will be encrypted with the provided encryption key.

gca_resource

The underlying resource proto representation.

has_failed

Returns True if pipeline has failed.

False otherwise.

labels

User-defined labels containing metadata about this resource.

Read more about labels at https://goo.gl/xmQnxf

name

Name of this resource.

resource_name

Full qualified resource name.

state

Current pipeline state.

update_time

Time this resource was last updated.

Methods

__init_subclass__

__init_subclass__(
    *,
    experiment_loggable_schemas: typing.Tuple[
        google.cloud.aiplatform.metadata.experiment_resources._ExperimentLoggableSchema
    ],
    **kwargs
)

Register the metadata_schema for the subclass so Experiment can use it to retrieve the associated types.

usage:

class PipelineJob(..., experiment_loggable_schemas= (_ExperimentLoggableSchema(title='system.PipelineRun'), )

cancel

cancel() -> None

Starts asynchronous cancellation on the PipelineJob. The server makes a best effort to cancel the job, but success is not guaranteed. On successful cancellation, the PipelineJob is not deleted; instead it becomes a job with state set to CANCELLED.

clone

clone(
    display_name: typing.Optional[str] = None,
    job_id: typing.Optional[str] = None,
    pipeline_root: typing.Optional[str] = None,
    parameter_values: typing.Optional[typing.Dict[str, typing.Any]] = None,
    input_artifacts: typing.Optional[typing.Dict[str, str]] = None,
    enable_caching: typing.Optional[bool] = None,
    encryption_spec_key_name: typing.Optional[str] = None,
    labels: typing.Optional[typing.Dict[str, str]] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
) -> google.cloud.aiplatform.pipeline_jobs.PipelineJob

Returns a new PipelineJob object with the same settings as the original one.

Parameters
NameDescription
display_name str

Optional. The user-defined name of this cloned Pipeline. If not specified, original pipeline display name will be used.

job_id str

Optional. The unique ID of the job run. If not specified, "cloned" + pipeline name + timestamp will be used.

pipeline_root str

Optional. The root of the pipeline outputs. Default to be the same staging bucket as original pipeline.

parameter_values Dict[str, Any]

Optional. The mapping from runtime parameter names to its values that control the pipeline run. Defaults to be the same values as original PipelineJob.

input_artifacts Dict[str, str]

Optional. The mapping from the runtime parameter name for this artifact to its resource id. Defaults to be the same values as original PipelineJob. For example: "vertex_model":"456". Note: full resource name ("projects/123/locations/us-central1/metadataStores/default/artifacts/456") cannot be used.

enable_caching bool

Optional. Whether to turn on caching for the run. If this is not set, defaults to be the same as original pipeline. If this is set, the setting applies to all tasks in the pipeline.

encryption_spec_key_name str

Optional. The Cloud KMS resource identifier of the customer managed encryption key used to protect the job. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created. If this is set, then all resources created by the PipelineJob will be encrypted with the provided encryption key. If not specified, encryption_spec of original PipelineJob will be used.

labels Dict[str, str]

Optional. The user defined metadata to organize PipelineJob.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to create this PipelineJob. Overrides credentials set in aiplatform.init.

project str

Optional. The project that you want to run this PipelineJob in. If not set, the project set in original PipelineJob will be used.

location str

Optional. Location to create PipelineJob. If not set, location set in original PipelineJob will be used.

Exceptions
TypeDescription
ValueErrorIf job_id or labels have incorrect format.

create_schedule

create_schedule(
    cron: str,
    display_name: str,
    start_time: typing.Optional[str] = None,
    end_time: typing.Optional[str] = None,
    allow_queueing: bool = False,
    max_run_count: typing.Optional[int] = None,
    max_concurrent_run_count: int = 1,
    service_account: typing.Optional[str] = None,
    network: typing.Optional[str] = None,
    create_request_timeout: typing.Optional[float] = None,
) -> google.cloud.aiplatform.pipeline_job_schedules.PipelineJobSchedule

Creates a PipelineJobSchedule directly from a PipelineJob.

Example Usage:

pipeline_job = aiplatform.PipelineJob( display_name='job_display_name', template_path='your_pipeline.yaml', ) pipeline_job.run() pipeline_job_schedule = pipeline_job.create_schedule( cron='* * * * *', display_name='schedule_display_name', )

Parameters
NameDescription
cron str

Required. Time specification (cron schedule expression) to launch scheduled runs. To explicitly set a timezone to the cron tab, apply a prefix: "CRON_TZ=${IANA_TIME_ZONE}" or "TZ=${IANA_TIME_ZONE}". The ${IANA_TIME_ZONE} may only be a valid string from IANA time zone database. For example, "CRON_TZ=America/New_York 1 * * * *", or "TZ=America/New_York 1 * * * *".

display_name str

Required. The user-defined name of this PipelineJobSchedule.

start_time str

Optional. Timestamp after which the first run can be scheduled. If unspecified, it defaults to the schedule creation timestamp.

end_time str

Optional. Timestamp after which no more runs will be scheduled. If unspecified, then runs will be scheduled indefinitely.

allow_queueing bool

Optional. Whether new scheduled runs can be queued when max_concurrent_runs limit is reached.

max_run_count int

Optional. Maximum run count of the schedule. If specified, The schedule will be completed when either started_run_count >= max_run_count or when end_time is reached. Must be positive and <= 2^63-1.

max_concurrent_run_count int

Optional. Maximum number of runs that can be started concurrently for this PipelineJobSchedule.

service_account str

Optional. Specifies the service account for workload run-as account. Users submitting jobs must have act-as permission on this run-as account.

network str

Optional. The full name of the Compute Engine network to which the job should be peered. For example, projects/12345/global/networks/myVPC. Private services access must already be configured for the network. If left unspecified, the network set in aiplatform.init will be used. Otherwise, the job is not peered with any network.

create_request_timeout float

Optional. The timeout for the create request in seconds.

delete

delete(sync: bool = True) -> None

Deletes this Vertex AI resource. WARNING: This deletion is permanent.

Parameter
NameDescription
sync bool

Whether to execute this deletion synchronously. If False, this method will be executed in concurrent Future and any downstream object will be immediately returned and synced when the Future has completed.

done

done() -> bool

Helper method that return True is PipelineJob is done. False otherwise.

from_pipeline_func

from_pipeline_func(
    pipeline_func: typing.Callable,
    parameter_values: typing.Optional[typing.Dict[str, typing.Any]] = None,
    input_artifacts: typing.Optional[typing.Dict[str, str]] = None,
    output_artifacts_gcs_dir: typing.Optional[str] = None,
    enable_caching: typing.Optional[bool] = None,
    context_name: typing.Optional[str] = "pipeline",
    display_name: typing.Optional[str] = None,
    labels: typing.Optional[typing.Dict[str, str]] = None,
    job_id: typing.Optional[str] = None,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
    encryption_spec_key_name: typing.Optional[str] = None,
) -> google.cloud.aiplatform.pipeline_jobs.PipelineJob

Creates PipelineJob by compiling a pipeline function.

Parameters
NameDescription
pipeline_func Callable

Required. A pipeline function to compile. A pipeline function creates instances of components and connects component inputs to outputs.

parameter_values Dict[str, Any]

Optional. The mapping from runtime parameter names to its values that control the pipeline run.

input_artifacts Dict[str, str]

Optional. The mapping from the runtime parameter name for this artifact to its resource id. For example: "vertex_model":"456". Note: full resource name ("projects/123/locations/us-central1/metadataStores/default/artifacts/456") cannot be used.

output_artifacts_gcs_dir str

Optional. The GCS location of the pipeline outputs. A GCS bucket for artifacts will be created if not specified.

enable_caching bool

Optional. Whether to turn on caching for the run. If this is not set, defaults to the compile time settings, which are True for all tasks by default, while users may specify different caching options for individual tasks. If this is set, the setting applies to all tasks in the pipeline. Overrides the compile time settings.

context_name str

Optional. The name of metadata context. Used for cached execution reuse.

display_name str

Optional. The user-defined name of this Pipeline.

labels Dict[str, str]

Optional. The user defined metadata to organize PipelineJob.

job_id str

Optional. The unique ID of the job run. If not specified, pipeline name + timestamp will be used.

project str

Optional. The project that you want to run this PipelineJob in. If not set, the project set in aiplatform.init will be used.

location str

Optional. Location to create PipelineJob. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to create this PipelineJob. Overrides credentials set in aiplatform.init.

encryption_spec_key_name str

Optional. The Cloud KMS resource identifier of the customer managed encryption key used to protect the job. Has the form: projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key. The key needs to be in the same region as where the compute resource is created. If this is set, then all resources created by the PipelineJob will be encrypted with the provided encryption key. Overrides encryption_spec_key_name set in aiplatform.init.

Exceptions
TypeDescription
ValueErrorIf job_id or labels have incorrect format.

get

get(
    resource_name: str,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
) -> google.cloud.aiplatform.pipeline_jobs.PipelineJob

Get a Vertex AI Pipeline Job for the given resource_name.

Parameters
NameDescription
resource_name str

Required. A fully-qualified resource name or ID.

project str

Optional. Project to retrieve dataset from. If not set, project set in aiplatform.init will be used.

location str

Optional. Location to retrieve dataset from. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to upload this model. Overrides credentials set in aiplatform.init.

get_associated_experiment

get_associated_experiment() -> (
    typing.Optional[google.cloud.aiplatform.metadata.experiment_resources.Experiment]
)

Gets the aiplatform.Experiment associated with this PipelineJob, or None if this PipelineJob is not associated with an experiment.

list

list(
    filter: typing.Optional[str] = None,
    order_by: typing.Optional[str] = None,
    enable_simple_view: bool = False,
    project: typing.Optional[str] = None,
    location: typing.Optional[str] = None,
    credentials: typing.Optional[google.auth.credentials.Credentials] = None,
) -> typing.List[google.cloud.aiplatform.pipeline_jobs.PipelineJob]

List all instances of this PipelineJob resource.

Example Usage:

aiplatform.PipelineJob.list( filter='display_name="experiment_a27"', order_by='create_time desc' )

Parameters
NameDescription
filter str

Optional. An expression for filtering the results of the request. For field names both snake_case and camelCase are supported.

order_by str

Optional. A comma-separated list of fields to order by, sorted in ascending order. Use "desc" after a field name for descending. Supported fields: display_name, create_time, update_time

enable_simple_view bool

Optional. Whether to pass the read_mask parameter to the list call. Defaults to False if not provided. This will improve the performance of calling list(). However, the returned PipelineJob list will not include all fields for each PipelineJob. Setting this to True will exclude the following fields in your response: runtime_config, service_account, network, and some subfields of pipeline_spec and job_detail. The following fields will be included in each PipelineJob resource in your response: state, display_name, pipeline_spec.pipeline_info, create_time, start_time, end_time, update_time, labels, template_uri, template_metadata.version, job_detail.pipeline_run_context, job_detail.pipeline_context.

project str

Optional. Project to retrieve list from. If not set, project set in aiplatform.init will be used.

location str

Optional. Location to retrieve list from. If not set, location set in aiplatform.init will be used.

credentials auth_credentials.Credentials

Optional. Custom credentials to use to retrieve list. Overrides credentials set in aiplatform.init.

run

run(
    service_account: typing.Optional[str] = None,
    network: typing.Optional[str] = None,
    sync: typing.Optional[bool] = True,
    create_request_timeout: typing.Optional[float] = None,
) -> None

Run this configured PipelineJob and monitor the job until completion.

Parameters
NameDescription
service_account str

Optional. Specifies the service account for workload run-as account. Users submitting jobs must have act-as permission on this run-as account.

network str

Optional. The full name of the Compute Engine network to which the job should be peered. For example, projects/12345/global/networks/myVPC. Private services access must already be configured for the network. If left unspecified, the network set in aiplatform.init will be used. Otherwise, the job is not peered with any network.

sync bool

Optional. Whether to execute this method synchronously. If False, this method will unblock and it will be executed in a concurrent Future.

create_request_timeout float

Optional. The timeout for the create request in seconds.

submit

submit(
    service_account: typing.Optional[str] = None,
    network: typing.Optional[str] = None,
    create_request_timeout: typing.Optional[float] = None,
    *,
    experiment: typing.Optional[
        typing.Union[
            google.cloud.aiplatform.metadata.experiment_resources.Experiment, str
        ]
    ] = None
) -> None

Run this configured PipelineJob.

Parameters
NameDescription
service_account str

Optional. Specifies the service account for workload run-as account. Users submitting jobs must have act-as permission on this run-as account.

network str

Optional. The full name of the Compute Engine network to which the job should be peered. For example, projects/12345/global/networks/myVPC. Private services access must already be configured for the network. If left unspecified, the network set in aiplatform.init will be used. Otherwise, the job is not peered with any network.

create_request_timeout float

Optional. The timeout for the create request in seconds.

experiment Union[str, experiments_resource.Experiment]

Optional. The Vertex AI experiment name or instance to associate to this PipelineJob. Metrics produced by the PipelineJob as system.Metric Artifacts will be associated as metrics to the current Experiment Run. Pipeline parameters will be associated as parameters to the current Experiment Run.

to_dict

to_dict() -> typing.Dict[str, typing.Any]

Returns the resource proto as a dictionary.

wait

wait()

Wait for this PipelineJob to complete.

wait_for_resource_creation

wait_for_resource_creation() -> None

Waits until resource has been created.