Write a component to show a Google Cloud console link
It's common that, when you run a component, you want to see not only the link to the component job being launched, but also the link to the underlying cloud resources, such as Vertex batch prediction jobs or Dataflow jobs.
The gcp_resource proto is a special parameter that you can use in your component to enable the Google Cloud console to provide a customized view of the resource's logs and status in the Vertex AI Pipelines console.
Output the gcp_resource parameter
Using a container-based component
First, you need to define the gcp_resources parameter in your component, as shown in the following example component.py file:
    # Copyright 2023 The Kubeflow Authors. All Rights Reserved.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    from typing import List

    from google_cloud_pipeline_components import _image
    from google_cloud_pipeline_components import _placeholders
    from kfp.dsl import container_component
    from kfp.dsl import ContainerSpec
    from kfp.dsl import OutputPath


    @container_component
    def dataflow_python(
        python_module_path: str,
        temp_location: str,
        gcp_resources: OutputPath(str),
        location: str = 'us-central1',
        requirements_file_path: str = '',
        args: List[str] = [],
        project: str = _placeholders.PROJECT_ID_PLACEHOLDER,
    ):
        # fmt: off
        """Launch a self-executing Beam Python file on Google Cloud using the
        Dataflow Runner.

        Args:
            location: Location of the Dataflow job. If not set, defaults to `'us-central1'`.
            python_module_path: The GCS path to the Python file to run.
            temp_location: A GCS path for Dataflow to stage temporary job files created during the execution of the pipeline.
            requirements_file_path: The GCS path to the pip requirements file.
            args: The list of args to pass to the Python file. Can include additional parameters for the Dataflow Runner.
            project: Project to create the Dataflow job. Defaults to the project in which the PipelineJob is run.

        Returns:
            gcp_resources: Serialized gcp_resources proto tracking the Dataflow job. For more details, see https://github.com/kubeflow/pipelines/blob/master/components/google-cloud/google_cloud_pipeline_components/proto/README.md.
        """
        # fmt: on
        return ContainerSpec(
            image=_image.GCPC_IMAGE_TAG,
            command=[
                'python3',
                '-u',
                '-m',
                'google_cloud_pipeline_components.container.v1.dataflow.dataflow_launcher',
            ],
            args=[
                '--project',
                project,
                '--location',
                location,
                '--python_module_path',
                python_module_path,
                '--temp_location',
                temp_location,
                '--requirements_file_path',
                requirements_file_path,
                '--args',
                args,
                '--gcp_resources',
                gcp_resources,
            ],
        )
Next, inside the container, install the Google Cloud Pipeline Components package:
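    pip install --upgrade google-cloud-pipeline-components

Next, in the Python code, define the resource as a gcp_resources parameter:

    from google_cloud_pipeline_components.proto.gcp_resources_pb2 import GcpResources
    from google.protobuf.json_format import MessageToJson

    dataflow_resources = GcpResources()
    dr = dataflow_resources.resources.add()
    dr.resource_type = 'DataflowJob'
    dr.resource_uri = 'https://dataflow.googleapis.com/v1b3/projects/[your-project]/locations/us-east1/jobs/[dataflow-job-id]'

    with open(gcp_resources, 'w') as f:
        f.write(MessageToJson(dataflow_resources))

Using a Python component

Alternatively, you can return the gcp_resources output parameter as you would any string output parameter:

    @dsl.component(
        base_image='python:3.9',
        packages_to_install=['google-cloud-pipeline-components==2.19.0'],
    )
    def launch_dataflow_component(project: str, location: str) -> NamedTuple("Outputs", [("gcp_resources", str)]):
        # Launch the dataflow job
        dataflow_job_id = [dataflow-id]
        dataflow_resources = GcpResources()
        dr = dataflow_resources.resources.add()
        dr.resource_type = 'DataflowJob'
        dr.resource_uri = f'https://dataflow.googleapis.com/v1b3/projects/{project}/locations/{location}/jobs/{dataflow_job_id}'
        gcp_resources = MessageToJson(dataflow_resources)
        return gcp_resources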
Supported resource_type values

You can set resource_type to an arbitrary string, but only the following types have links in the Google Cloud console (a short sketch after this list shows how one of these might be serialized):
BatchPredictionJob
BigQueryJob
CustomJob
DataflowJob
HyperparameterTuningJob
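As an illustration only (not taken from the prebuilt components), the following minimal sketch serializes a gcp_resources entry for a hypothetical Vertex AI custom job. The project, location, and job ID values are placeholders, and the resource_uri format shown is an assumption based on the Vertex AI REST API:

    from google_cloud_pipeline_components.proto.gcp_resources_pb2 import GcpResources
    from google.protobuf.json_format import MessageToJson

    # Placeholder values for illustration only.
    project = 'my-project'
    location = 'us-central1'
    custom_job_id = '1234567890'

    custom_job_resources = GcpResources()
    resource = custom_job_resources.resources.add()
    resource.resource_type = 'CustomJob'  # one of the types linked in the console
    resource.resource_uri = (
        f'https://{location}-aiplatform.googleapis.com/v1/projects/{project}'
        f'/locations/{location}/customJobs/{custom_job_id}'
    )

    # Serialize to JSON; in a component, write this string to the gcp_resources output path.
    print(MessageToJson(custom_job_resources))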
Write a component to cancel the underlying resources
When a pipeline job is canceled, the default behavior is for the underlying Google Cloud resources to keep running; they are not canceled automatically. To change this behavior, you should attach a SIGTERM handler to the pipeline job. A good place to do this is just before a polling loop for a job that could run for a long time.
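As a minimal sketch (not the code used by the prebuilt components), the following shows the shape of such a handler, attached just before a polling loop. The cancel_underlying_resource and job_is_done helpers are hypothetical placeholders for the real cancel and status API calls:

    import signal
    import sys
    import time

    def cancel_underlying_resource(job_id: str) -> None:
        # Placeholder: a real component would call the resource's cancel API here
        # (for example, the Dataflow or Vertex AI REST API).
        print(f'Cancelling underlying job {job_id}')

    def job_is_done(job_id: str) -> bool:
        # Placeholder: a real component would poll the job's status here.
        return True

    def wait_for_job(job_id: str, poll_interval_seconds: int = 30) -> None:
        def handler(signum, frame):
            # Runs when the pipeline job is canceled and the container receives SIGTERM.
            cancel_underlying_resource(job_id)
            sys.exit(1)

        # Attach the handler just before the long-running polling loop.
        signal.signal(signal.SIGTERM, handler)
        while not job_is_done(job_id):
            time.sleep(poll_interval_seconds)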
Cancellation has been implemented in several Google Cloud Pipeline Components, including:
Batch prediction job
BigQuery ML job
Custom job
Dataproc Serverless batch job
Hyperparameter tuning job
For more information, including sample code that shows how to attach a SIGTERM handler, see the following GitHub links:
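https://github.com/kubeflow/pipelines/blob/google-cloud-pipeline-components-2.19.0/components/google-cloud/google_cloud_pipeline_components/container/utils/execution_context.py
https://github.com/kubeflow/pipelines/blob/google-cloud-pipeline-components-2.19.0/components/google-cloud/google_cloud_pipeline_components/container/v1/gcp_launcher/job_remote_runner.py#L124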
Consider the following when implementing your SIGTERM handler:
Cancellation propagation works only after the component has been running for a few minutes. This is typically due to background startup tasks that need to be processed before the Python signal handlers are called.
Cancellation might not be implemented for some Google Cloud resources. For example, creating or deleting a Vertex AI endpoint or model could create a long-running operation that accepts a cancellation request through its REST API, but doesn't implement the cancellation operation itself.