使用标签整理资源

本文档介绍了如何使用标签整理批处理资源。

标签是应用于资源的键值对,用于对资源进行分组和描述。批处理作业具有预定义标签(自动应用于资源)和自定义标签(您可以在创建作业时定义和应用)。

借助标签,您可以过滤资源列表和 Cloud Billing 报告的结果。例如,您可以使用标签执行以下操作:

  • 阐明并整理项目的作业列表。

  • 使用标签来描述可运行对象指定的容器或脚本类型,从而区分作业的可运行对象。

  • 通过过滤 Cloud Billing 报告中由批处理作业或特定作业创建的资源,分析费用。

如需详细了解标签,请参阅 Compute Engine 文档中的标签

准备工作

  1. 如果您之前未使用过批处理功能,请参阅开始使用批处理,并完成适用于项目和用户的前提条件,以启用批处理功能。
  2. 如需获得创建作业所需的权限,请让您的管理员为您授予以下 IAM 角色:

    如需详细了解如何授予角色,请参阅管理对项目、文件夹和组织的访问权限

    您也可以通过自定义角色或其他预定义角色来获取所需的权限。

限制

除了 Compute Engine 文档中指定的标签要求之外,向 Batch 作业及其资源应用标签还存在以下限制:

  • 批处理仅支持为使用批处理创建的以下类型的资源添加标签:

  • 考虑到批处理工具自动应用于作业的预定义标签,您可以定义以下数量的自定义标签:

    • 您最多可以定义 63 个自定义标签,以应用于作业及其可运行作业。

    • 您最多可以定义 61 个自定义标签,以应用于为作业创建的每个 GPU、永久性磁盘和虚拟机。

  • 批处理仅支持使用具有唯一名称的自定义标签进行定义。这会导致以下结果:

    • 尝试替换预定义标签会导致错误。

    • 如果定义了重复的自定义标签,系统会覆盖现有自定义标签。

  • Batch 仅支持在创建作业时定义标签。

    • 您无法添加、更新或移除作业和可运行作业的标签。

    • 虽然您可以使用 Compute Engine 为为作业创建的永久性磁盘和虚拟机添加、更新或移除标签,但不建议这样做。无法可靠地估算作业资源的存在时间范围,并且任何更改都可能无法与批处理正确配合使用。

  • 如需使用标签过滤作业列表,您必须使用 gcloud CLI 或批处理 API 查看作业列表

预定义标签

每个预定义标签都有一个以 batch- 前缀开头的键。默认情况下,批处理会自动应用以下预定义标签:

  • 对于您创建的每个作业:

    • batch-job-id:此标签的值设为作业的名称。
  • 针对为作业创建的每个 GPU、永久性磁盘和虚拟机:

    • batch-job-id:此标签的值设为作业的名称。

    • batch-job-uid:此标签的值设置为相应职位的唯一标识符 (UID)。

    • batch-node:此标签的值为 null,它只是对为作业创建的所有 GPU、永久性磁盘和虚拟机进行分组。例如,在查看 Cloud Billing 报告时,您可以使用此标签来确定由批处理作业创建的所有 GPU、永久性磁盘和虚拟机的费用。

定义自定义标签

创建作业时,您可以选择定义一个或多个自定义标签。您可以使用新密钥或项目已使用的密钥定义自定义标签。如需定义自定义标签,请根据标签的用途,从本文档中选择一种或多种方法:

  • 为作业及其资源定义自定义标签

    本部分介绍了如何将一个或多个自定义标签应用于作业以及为作业创建的每个 GPU、永久性磁盘和虚拟机。创建作业后,您可以使用这些标签来过滤 Cloud Billing 报告以及项目的作业、永久性磁盘和虚拟机列表。

  • 为作业定义自定义标签

    本部分介绍了如何向作业应用一个或多个自定义标签。创建作业后,您可以使用这些标签过滤项目的作业列表。

  • 为可运行项定义自定义标签

    本部分介绍了如何将一个或多个自定义标签应用于作业的一个或多个可运行作业。创建作业后,您可以使用这些标签来过滤项目的作业列表。

为作业及其资源定义自定义标签

labels 字段中为作业的分配政策定义的标签会应用于作业,以及为作业创建的每个 GPU(如果有)、永久性磁盘(所有启动磁盘和任何新的存储卷)和虚拟机。

使用 gcloud CLI 或 Batch API 创建作业时,您可以为作业及其资源定义标签。

gcloud

例如,如需在 us-central1 中创建一个基本容器作业,其中定义了两个适用于作业及其为作业创建的资源的自定义标签,请按以下步骤操作:

  1. 创建一个 JSON 文件,用于指定作业的配置详细信息和 allocationPolicy.labels 字段

    {
      "allocationPolicy": {
        "instances": [
          {
            "policy": {
              "machineType": "e2-standard-4"
            }
          }
        ],
        "labels": {
          "VM_LABEL_NAME1": "VM_LABEL_VALUE1",
          "VM_LABEL_NAME2": "VM_LABEL_VALUE2"
        }
      },
      "taskGroups": [
        {
          "taskSpec": {
            "runnables": [
              {
                "container": {
                  "imageUri": "gcr.io/google-containers/busybox",
                  "entrypoint": "/bin/sh",
                  "commands": [
                    "-c",
                    "echo Hello world!"
                  ]
                }
              }
            ]
          }
        }
      ]
    }
    

    替换以下内容:

    • VM_LABEL_NAME1:要应用于为作业创建的虚拟机的第一个标签的名称。

    • VM_LABEL_VALUE1:要应用于为作业创建的虚拟机的第一个标签的值。

    • VM_LABEL_NAME2:要应用于为作业创建的虚拟机的第二个标签的名称。

    • VM_LABEL_VALUE2:要应用于为作业创建的虚拟机的第二个标签的值。

  2. 使用 gcloud batch jobs submit 命令us-central1 中创建作业。

    gcloud batch jobs submit example-job \
        --config=JSON_CONFIGURATION_FILE \
        --location=us-central1
    

    JSON_CONFIGURATION_FILE 替换为包含您在上一步中创建的作业配置详细信息的 JSON 文件的路径。

API

例如,如需在 us-central1 中创建一个基本容器作业,其中定义了两个应用于作业及其为作业创建的资源的自定义标签,请向 jobs.create 方法发出 POST 请求,并指定 allocationPolicy.labels 字段

POST https://batch.googleapis.com/v1/projects/example-project/locations/us-central1/jobs?job_id=example-job

{
  "allocationPolicy": {
    "instances": [
      {
        "policy": {
          "machineType": "e2-standard-4"
        }
      }
    ],
    "labels": {
      "VM_LABEL_NAME1": "VM_LABEL_VALUE1",
      "VM_LABEL_NAME2": "VM_LABEL_VALUE2"
    }
  },
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "container": {
              "imageUri": "gcr.io/google-containers/busybox",
              "entrypoint": "/bin/sh",
              "commands": [
                "-c",
                "echo Hello world!"
              ]
            }
          }
        ]
      }
    }
  ]
}

替换以下内容:

  • VM_LABEL_NAME1:要应用于为作业创建的虚拟机的第一个标签的名称。

  • VM_LABEL_VALUE1:要应用于为作业创建的虚拟机的第一个标签的值。

  • VM_LABEL_NAME2:要应用于为作业创建的虚拟机的第二个标签的名称。

  • VM_LABEL_VALUE2:要应用于为作业创建的虚拟机的第二个标签的值。

Java


import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateBatchAllocationPolicyLabel {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";
    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "us-central1";
    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "example-job";
    // Name of the label1 to be applied for your Job.
    String labelName1 = "VM_LABEL_NAME1";
    // Value for the label1 to be applied for your Job.
    String labelValue1 = "VM_LABEL_VALUE1";
    // Name of the label2 to be applied for your Job.
    String labelName2 = "VM_LABEL_NAME2";
    // Value for the label2 to be applied for your Job.
    String labelValue2 = "VM_LABEL_VALUE2";

    createBatchAllocationPolicyLabel(projectId, region, jobName, labelName1,
        labelValue1, labelName2, labelValue2);
  }

  // This method shows how to create a job with labels defined 
  // in the labels field of a job's allocation policy. These are 
  // applied to the job, as well as to each GPU (if any), persistent disk 
  // (all boot disks and any new storage volumes), and VM created for the job.
  public static Job createBatchAllocationPolicyLabel(String projectId, String region,
                               String jobName, String labelName1,
                               String labelValue1, String labelName2, String labelValue2)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setContainer(
                  Runnable.Container.newBuilder()
                      .setImageUri("gcr.io/google-containers/busybox")
                      .setEntrypoint("/bin/sh")
                      .addCommands("-c")
                      .addCommands(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                              + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      .build())
              .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 50% of a single CPUs.
              .setCpuMilli(2000)
              // In MiB.
              .setMemoryMib(2000)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addRunnables(runnable)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(1).setTaskSpec(task).build();

      // Policies are used to define on what kind of virtual machines the tasks will run on.
      // In this case, we tell the system to use "e2-standard-4" machine type.
      // Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
      AllocationPolicy.InstancePolicy instancePolicy =
          AllocationPolicy.InstancePolicy.newBuilder().setMachineType("e2-standard-4").build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(AllocationPolicy.InstancePolicyOrTemplate.newBuilder()
                  .setPolicy(instancePolicy)
                  .build())
              // Labels and their value to be applied to the job and its resources
              .putLabels(labelName1, labelValue1)
              .putLabels(labelName2, labelValue2)
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(LogsPolicy.newBuilder()
                      .setDestination(LogsPolicy.Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }

}

Node.js

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

/**
 * TODO(developer): Update these variables before running the sample.
 */
// Project ID or project number of the Google Cloud project you want to use.
const projectId = await batchClient.getProjectId();
// Name of the region you want to use to run the job. Regions that are
// available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
const region = 'europe-central2';
// The name of the job that will be created.
// It needs to be unique for each project and region pair.
const jobName = 'batch-labels-allocation-job';
// Name of the label1 to be applied for your Job.
const labelName1 = 'vm_label_name_1';
// Value for the label1 to be applied for your Job.
const labelValue1 = 'vmLabelValue1';
// Name of the label2 to be applied for your Job.
const labelName2 = 'vm_label_name_2';
// Value for the label2 to be applied for your Job.
const labelValue2 = 'vmLabelValue2';

// Define what will be done as part of the job.
const runnable = new batch.Runnable({
  script: new batch.Runnable.Script({
    commands: ['-c', 'echo Hello world! This is task ${BATCH_TASK_INDEX}.'],
  }),
});

// Specify what resources are requested by each task.
const computeResource = new batch.ComputeResource({
  // In milliseconds per cpu-second. This means the task requires 50% of a single CPUs.
  cpuMilli: 500,
  // In MiB.
  memoryMib: 16,
});

const task = new batch.TaskSpec({
  runnables: [runnable],
  computeResource,
  maxRetryCount: 2,
  maxRunDuration: {seconds: 3600},
});

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup({
  taskCount: 3,
  taskSpec: task,
});

// Policies are used to define on what kind of virtual machines the tasks will run on.
// In this case, we tell the system to use "e2-standard-4" machine type.
// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
const instancePolicy = new batch.AllocationPolicy.InstancePolicy({
  machineType: 'e2-standard-4',
});

const allocationPolicy = new batch.AllocationPolicy({
  instances: [{policy: instancePolicy}],
});
// Labels and their value to be applied to the job and its resources.
allocationPolicy.labels[labelName1] = labelValue1;
allocationPolicy.labels[labelName2] = labelValue2;

const job = new batch.Job({
  name: jobName,
  taskGroups: [group],
  labels: {env: 'testing', type: 'script'},
  allocationPolicy,
  // We use Cloud Logging as it's an option available out of the box
  logsPolicy: new batch.LogsPolicy({
    destination: batch.LogsPolicy.Destination.CLOUD_LOGGING,
  }),
});

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateBatchLabelsAllocation() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const [response] = await batchClient.createJob(request);
  console.log(JSON.stringify(response));
}

await callCreateBatchLabelsAllocation();

Python

from google.cloud import batch_v1


def create_job_with_custom_allocation_policy_labels(
    project_id: str, region: str, job_name: str, labels: dict
) -> batch_v1.Job:
    """
    This method shows the creation of a Batch job with custom labels which describe the allocation policy.
    Args:
        project_id (str): project ID or project number of the Cloud project you want to use.
        region (str): name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/locations
        job_name (str): the name of the job that will be created.
        labels (dict): a dictionary of key-value pairs that will be used as labels
            E.g., {"label_key1": "label_value2", "label_key2": "label_value2"}
    Returns:
        batch_v1.Job: The created Batch job object containing configuration details.
    """
    client = batch_v1.BatchServiceClient()

    runnable = batch_v1.Runnable()
    runnable.container = batch_v1.Runnable.Container()
    runnable.container.image_uri = "gcr.io/google-containers/busybox"
    runnable.container.entrypoint = "/bin/sh"
    runnable.container.commands = [
        "-c",
        "echo Hello world!",
    ]

    # Create a task specification and assign the runnable and volume to it
    task = batch_v1.TaskSpec()
    task.runnables = [runnable]

    # Specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16  # in MiB
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Create a task group and assign the task specification to it
    group = batch_v1.TaskGroup()
    group.task_count = 3
    group.task_spec = task

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    # Assign the provided labels to the allocation policy
    allocation_policy.labels = labels

    # Create the job and assign the task group and allocation policy to it
    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy

    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    # Create the job request and set the job and job ID
    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

为作业定义自定义标签

作业的 labels 字段中定义的标签仅应用于作业。

您可以在使用 gcloud CLI 或 Batch API 创建作业时为作业定义标签。

gcloud

例如,如需在 us-central1 中创建一个基本容器作业,并定义两个应用于作业本身的自定义标签,请按以下步骤操作:

  1. 创建一个 JSON 文件,用于指定作业的配置详细信息和 labels 字段

    {
      "taskGroups": [
        {
          "taskSpec": {
            "runnables": [
              {
                "container": {
                  "imageUri": "gcr.io/google-containers/busybox",
                  "entrypoint": "/bin/sh",
                  "commands": [
                    "-c",
                    "echo Hello World!"
                  ]
                }
              }
            ]
          }
        }
      ],
      "labels": {
        "JOB_LABEL_NAME1": "JOB_LABEL_VALUE1",
        "JOB_LABEL_NAME2": "JOB_LABEL_VALUE2"
      }
    }
    

    替换以下内容:

    • JOB_LABEL_NAME1:要应用于作业的第一个标签的名称。

    • JOB_LABEL_VALUE1:要应用于作业的第一个标签的值。

    • JOB_LABEL_NAME2:要应用于作业的第二个标签的名称。

    • JOB_LABEL_VALUE2:要应用于作业的第二个标签的值。

  2. 使用带有以下标志的 gcloud batch jobs submit 命令us-central1 中创建作业:

    gcloud batch jobs submit example-job \
        --config=JSON_CONFIGURATION_FILE \
        --location=us-central1
    

    JSON_CONFIGURATION_FILE 替换为包含您在上一步中创建的作业配置详细信息的 JSON 文件的路径。

API

例如,如需在 us-central1 中创建一个容器作业,并定义要应用于作业本身的两个自定义标签,请向 jobs.create 方法发出 POST 请求,并指定 labels 字段

POST https://batch.googleapis.com/v1/projects/example-project/locations/us-central1/jobs?job_id=example-job

{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "container": {
              "imageUri": "gcr.io/google-containers/busybox",
              "entrypoint": "/bin/sh",
              "commands": [
                "-c",
                "echo Hello World!"
              ]
            }
          }
        ]
      }
    }
  ],
  "labels": {
    "JOB_LABEL_NAME1": "JOB_LABEL_VALUE1",
    "JOB_LABEL_NAME2": "JOB_LABEL_VALUE2"
  }
}

替换以下内容:

  • JOB_LABEL_NAME1:要应用于作业的第一个标签的名称。

  • JOB_LABEL_VALUE1:要应用于作业的第一个标签的值。

  • JOB_LABEL_NAME2:要应用于作业的第二个标签的名称。

  • JOB_LABEL_VALUE2:要应用于作业的第二个标签的值。

Java


import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;


public class CreateBatchLabelJob {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";
    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "us-central1";
    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "example-job";
    // Name of the label1 to be applied for your Job.
    String labelName1 = "JOB_LABEL_NAME1";
    // Value for the label1 to be applied for your Job.
    String labelValue1 = "JOB_LABEL_VALUE1";
    // Name of the label2 to be applied for your Job.
    String labelName2 = "JOB_LABEL_NAME2";
    // Value for the label2 to be applied for your Job.
    String labelValue2 = "JOB_LABEL_VALUE2";

    createBatchLabelJob(projectId, region, jobName, labelName1,
        labelValue1, labelName2, labelValue2);
  }

  // Creates a job with labels defined in the labels field.
  public static Job createBatchLabelJob(String projectId, String region, String jobName,
                    String labelName1, String labelValue1, String labelName2, String labelValue2)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setContainer(
                  Runnable.Container.newBuilder()
                      .setImageUri("gcr.io/google-containers/busybox")
                      .setEntrypoint("/bin/sh")
                      .addCommands("-c")
                      .addCommands(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                              + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      .build())
              .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 50% of a single CPUs.
              .setCpuMilli(2000)
              // In MiB.
              .setMemoryMib(2000)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addRunnables(runnable)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(1).setTaskSpec(task).build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(LogsPolicy.newBuilder()
              .setDestination(LogsPolicy.Destination.CLOUD_LOGGING).build())
              // Labels and their value to be applied to the job.
              .putLabels(labelName1, labelValue1)
              .putLabels(labelName2, labelValue2)
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }

}

Node.js

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

/**
 * TODO(developer): Update these variables before running the sample.
 */
// Project ID or project number of the Google Cloud project you want to use.
const projectId = await batchClient.getProjectId();
// Name of the region you want to use to run the job. Regions that are
// available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
const region = 'europe-central2';
// The name of the job that will be created.
// It needs to be unique for each project and region pair.
const jobName = 'batch-labels-job';
// Name of the label1 to be applied for your Job.
const labelName1 = 'job_label_name_1';
// Value for the label1 to be applied for your Job.
const labelValue1 = 'job_label_value1';
// Name of the label2 to be applied for your Job.
const labelName2 = 'job_label_name_2';
// Value for the label2 to be applied for your Job.
const labelValue2 = 'job_label_value2';

// Define what will be done as part of the job.
const runnable = new batch.Runnable({
  container: new batch.Runnable.Container({
    imageUri: 'gcr.io/google-containers/busybox',
    entrypoint: '/bin/sh',
    commands: ['-c', 'echo Hello world! This is task ${BATCH_TASK_INDEX}.'],
  }),
});

// Specify what resources are requested by each task.
const computeResource = new batch.ComputeResource({
  // In milliseconds per cpu-second. This means the task requires 50% of a single CPUs.
  cpuMilli: 500,
  // In MiB.
  memoryMib: 16,
});

const task = new batch.TaskSpec({
  runnables: [runnable],
  computeResource,
  maxRetryCount: 2,
  maxRunDuration: {seconds: 3600},
});

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup({
  taskCount: 3,
  taskSpec: task,
});

const job = new batch.Job({
  name: jobName,
  taskGroups: [group],
  // We use Cloud Logging as it's an option available out of the box
  logsPolicy: new batch.LogsPolicy({
    destination: batch.LogsPolicy.Destination.CLOUD_LOGGING,
  }),
});

// Labels and their value to be applied to the job and its resources.
job.labels[labelName1] = labelValue1;
job.labels[labelName2] = labelValue2;

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateBatchLabelsJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const [response] = await batchClient.createJob(request);
  console.log(JSON.stringify(response));
}

await callCreateBatchLabelsJob();

Python

from google.cloud import batch_v1


def create_job_with_custom_job_labels(
    project_id: str,
    region: str,
    job_name: str,
    labels: dict,
) -> batch_v1.Job:
    """
    This method creates a Batch job with custom labels.
    Args:
        project_id (str): project ID or project number of the Cloud project you want to use.
        region (str): name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/locations
        job_name (str): the name of the job that will be created.
        labels (dict): A dictionary of custom labels to be added to the job.
            E.g., {"label_key1": "label_value2", "label_key2": "label_value2"}
    Returns:
        batch_v1.Job: The created Batch job object containing configuration details.
    """
    client = batch_v1.BatchServiceClient()

    runnable = batch_v1.Runnable()
    runnable.container = batch_v1.Runnable.Container()
    runnable.container.image_uri = "gcr.io/google-containers/busybox"
    runnable.container.entrypoint = "/bin/sh"
    runnable.container.commands = [
        "-c",
        "echo Hello world!",
    ]

    # Create a task specification and assign the runnable and volume to it
    task = batch_v1.TaskSpec()
    task.runnables = [runnable]

    # Specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16  # in MiB
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Create a task group and assign the task specification to it
    group = batch_v1.TaskGroup()
    group.task_count = 3
    group.task_spec = task

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    # Create the job and assign the task group and allocation policy to it
    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy

    # Set the labels for the job
    job.labels = labels

    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    # Create the job request and set the job and job ID
    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

为可运行文件定义自定义标签

labels 字段中为可运行项定义的标签仅应用于该可运行项。

使用 gcloud CLI 或 Batch API 创建作业时,您可以为一个或多个可运行作业定义标签。

gcloud

例如,如需在 us-central1 中创建一个作业,并为该作业的两个可运行对象分别定义一个自定义标签,请按以下步骤操作:

  1. 创建一个 JSON 文件,用于指定作业的配置详细信息和 runnables.labels 字段

    {
      "taskGroups": [
        {
          "taskSpec": {
            "runnables": [
              {
                "container": {
                  "imageUri": "gcr.io/google-containers/busybox",
                  "entrypoint": "/bin/sh",
                  "commands": [
                    "-c",
                    "echo Hello from task ${BATCH_TASK_INDEX}!"
                  ]
                },
                "labels": {
                  "RUNNABLE1_LABEL_NAME1": "RUNNABLE1_LABEL_VALUE1"
                }
              },
              {
                "script": {
                  "text": "echo Hello from task ${BATCH_TASK_INDEX}!"
                },
                "labels": {
                  "RUNNABLE2_LABEL_NAME1": "RUNNABLE2_LABEL_VALUE1"
                }
              }
            ]
          }
        }
      ]
    }
    

    替换以下内容:

    • RUNNABLE1_LABEL_NAME1:要应用于作业的第一个可运行项的标签的名称。

    • RUNNABLE1_LABEL_VALUE1:要应用于作业的第一个可运行项的标签的值。

    • RUNNABLE2_LABEL_NAME1:要应用于作业的第二个可运行项的标签的名称。

    • RUNNABLE2_LABEL_VALUE1:要应用于作业的第二个可运行项的标签的值。

  2. 使用 gcloud batch jobs submit 命令us-central1 中创建作业。

    gcloud batch jobs submit example-job \
        --config=JSON_CONFIGURATION_FILE \
        --location=us-central1
    

    JSON_CONFIGURATION_FILE 替换为包含您在上一步中创建的作业配置详细信息的 JSON 文件的路径。

API

例如,如需在 us-central1 中创建一项作业来定义两个自定义标签(每个作业的可运行对象各有一个),请向 jobs.create 方法发出 POST 请求,并指定 runnables.labels 字段

POST https://batch.googleapis.com/v1/projects/example-project/locations/us-central1/jobs?job_id=example-job

{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "container": {
              "imageUri": "gcr.io/google-containers/busybox",
              "entrypoint": "/bin/sh",
              "commands": [
                "-c",
                "echo Hello from ${BATCH_TASK_INDEX}!"
              ]
            },
            "labels": {
              "RUNNABLE1_LABEL_NAME1": "RUNNABLE1_LABEL_VALUE1"
            }
          },
          {
            "script": {
              "text": "echo Hello from ${BATCH_TASK_INDEX}!"
            },
            "labels": {
              "RUNNABLE2_LABEL_NAME1": "RUNNABLE2_LABEL_VALUE1"
            }
          }
        ]
      }
    }
  ]
}

替换以下内容:

  • RUNNABLE1_LABEL_NAME1:要应用于第一个作业的可运行项的标签的名称。

  • RUNNABLE1_LABEL_VALUE1:要应用于第一个作业的可运行项的标签的值。

  • RUNNABLE2_LABEL_NAME1:要应用于第二个作业的可运行项的标签的名称。

  • RUNNABLE2_LABEL_VALUE1:要应用于第二个作业的可运行项的标签的值。

Java


import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateBatchRunnableLabel {
  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";
    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "us-central1";
    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "example-job";
    // Name of the label1 to be applied for your Job.
    String labelName1 = "RUNNABLE_LABEL_NAME1";
    // Value for the label1 to be applied for your Job.
    String labelValue1 = "RUNNABLE_LABEL_VALUE1";
    // Name of the label2 to be applied for your Job.
    String labelName2 = "RUNNABLE_LABEL_NAME2";
    // Value for the label2 to be applied for your Job.
    String labelValue2 = "RUNNABLE_LABEL_VALUE2";

    createBatchRunnableLabel(projectId, region, jobName, labelName1,
        labelValue1, labelName2, labelValue2);
  }

  // Creates a job with labels defined in the labels field
  // for a runnable. The labels are only applied to that runnable.
  // In Batch, a runnable represents a single task or unit of work within a job.
  // It can be a container (like a Docker image) or a script.
  public static Job createBatchRunnableLabel(String projectId, String region, String jobName,
                   String labelName1, String labelValue1, String labelName2, String labelValue2)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setContainer(
                  Runnable.Container.newBuilder()
                      .setImageUri("gcr.io/google-containers/busybox")
                      .setEntrypoint("/bin/sh")
                      .addCommands("-c")
                      .addCommands(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                              + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      .build())
              // Label and its value to be applied to the container
              // that processes data from a specific region.
              .putLabels(labelName1, labelValue1)
              .setScript(Runnable.Script.newBuilder()
              .setText("echo Hello world! This is task ${BATCH_TASK_INDEX}. ").build())
              // Label and its value to be applied to the script
              // that performs some analysis on the processed data.
              .putLabels(labelName2, labelValue2)
              .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 50% of a single CPUs.
              .setCpuMilli(2000)
              // In MiB.
              .setMemoryMib(2000)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addRunnables(runnable)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(1).setTaskSpec(task).build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(LogsPolicy.newBuilder()
              .setDestination(LogsPolicy.Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run for the specific project.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }

}

Node.js

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

/**
 * TODO(developer): Update these variables before running the sample.
 */
// Project ID or project number of the Google Cloud project you want to use.
const projectId = await batchClient.getProjectId();
// Name of the region you want to use to run the job. Regions that are
// available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
const region = 'us-central1';
// The name of the job that will be created.
// It needs to be unique for each project and region pair.
const jobName = 'example-job';
// Name of the label1 to be applied for your Job.
const labelName1 = 'RUNNABLE_LABEL_NAME1';
// Value for the label1 to be applied for your Job.
const labelValue1 = 'RUNNABLE_LABEL_VALUE1';
// Name of the label2 to be applied for your Job.
const labelName2 = 'RUNNABLE_LABEL_NAME2';
// Value for the label2 to be applied for your Job.
const labelValue2 = 'RUNNABLE_LABEL_VALUE2';

const container = new batch.Runnable.Container({
  imageUri: 'gcr.io/google-containers/busybox',
  entrypoint: '/bin/sh',
  commands: ['-c', 'echo Hello world! This is task ${BATCH_TASK_INDEX}.'],
});

const script = new batch.Runnable.Script({
  commands: ['-c', 'echo Hello world! This is task ${BATCH_TASK_INDEX}.'],
});

const runnable1 = new batch.Runnable({
  container,
  // Label and its value to be applied to the container
  // that processes data from a specific region.
  labels: {
    [labelName1]: labelValue1,
  },
});

const runnable2 = new batch.Runnable({
  script,
  // Label and its value to be applied to the script
  // that performs some analysis on the processed data.
  labels: {
    [labelName2]: labelValue2,
  },
});

// Specify what resources are requested by each task.
const computeResource = new batch.ComputeResource({
  // In milliseconds per cpu-second. This means the task requires 50% of a single CPUs.
  cpuMilli: 500,
  // In MiB.
  memoryMib: 16,
});

const task = new batch.TaskSpec({
  runnables: [runnable1, runnable2],
  computeResource,
  maxRetryCount: 2,
  maxRunDuration: {seconds: 3600},
});

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup({
  taskCount: 3,
  taskSpec: task,
});

const job = new batch.Job({
  name: jobName,
  taskGroups: [group],
  // We use Cloud Logging as it's an option available out of the box
  logsPolicy: new batch.LogsPolicy({
    destination: batch.LogsPolicy.Destination.CLOUD_LOGGING,
  }),
});

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateBatchLabelsRunnable() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const [response] = await batchClient.createJob(request);
  console.log(JSON.stringify(response));
}

await callCreateBatchLabelsRunnable();

Python

from google.cloud import batch_v1


def create_job_with_custom_runnables_labels(
    project_id: str,
    region: str,
    job_name: str,
    labels: dict,
) -> batch_v1.Job:
    """
    This method creates a Batch job with custom labels for runnable.
    Args:
        project_id (str): project ID or project number of the Cloud project you want to use.
        region (str): name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/locations
        job_name (str): the name of the job that will be created.
        labels (dict): a dictionary of key-value pairs that will be used as labels
            E.g., {"label_key1": "label_value2"}
    Returns:
        batch_v1.Job: The created Batch job object containing configuration details.
    """
    client = batch_v1.BatchServiceClient()

    runnable = batch_v1.Runnable()
    runnable.display_name = "Script 1"
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = "echo Hello world from Script 1 for task ${BATCH_TASK_INDEX}"
    # Add custom labels to the first runnable
    runnable.labels = labels

    # Create a task specification and assign the runnable and volume to it
    task = batch_v1.TaskSpec()
    task.runnables = [runnable]

    # Specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16  # in MiB
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Create a task group and assign the task specification to it
    group = batch_v1.TaskGroup()
    group.task_count = 3
    group.task_spec = task

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    # Create the job and assign the task group and allocation policy to it
    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy

    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    # Create the job request and set the job and job ID
    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

后续步骤