Create and run a basic job

This document explains the basics of job creation with Batch: how to create and run a job that's based on a script or container image, and how to use predefined and custom variables. To learn more about creating and running jobs, see Job creation and execution overview.

Before you begin

  1. If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
  2. To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:

    For more information about granting roles, see Manage access to projects, folders, and organizations.

    You might also be able to get the required permissions through custom roles or other predefined roles.

  3. Each time you create a job, make sure that the job has a valid network configuration.
    • If you don't have any specific networking requirements for your workload or project, and you haven't modified your project's default network, no action is required.
    • Otherwise, you need to configure the network when you create a job. Learn how to configure networking for a job before you create a basic job, so that you can modify the following examples to fit your networking requirements.
    For more information about the network configuration for a job, see Batch networking overview.
  4. Each time you create a job, make sure that the job has a valid virtual machine (VM) operating system (OS) environment.
    • If you don't have any specific VM OS image or boot disk requirements for your workload, no action is required.
    • Otherwise, you need to prepare a valid VM OS environment option. Before you create a basic job, either allow the default configuration for the VM OS environment or learn how to customize it, so that you can modify the following examples to fit your requirements.
    For more information about the VM OS environment for a job, see VM OS environment overview.

Create a basic job

For information about all of the fields that you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource. To summarize, a job consists of an array of one or more tasks that all run one or more runnables, which are the executable scripts and/or containers for your job. To cover the basics, this section explains how to create an example job with just one runnable, either a script or a container image:

  • If you want to use Batch to create jobs that run a container image, see Create a basic container job.
  • Otherwise, if you aren't sure whether you want to use container images or you are unfamiliar with containers, creating a basic script job is recommended.

The example job for both types of jobs contains a task group with 4 tasks. Each task prints a message and its index to the standard output and Cloud Logging. The definition for this job specifies a parallelism of 2, which indicates that the job should run on 2 VMs so that 2 tasks can run at a time.
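The effect of the task count and parallelism settings can be sketched as follows. This is a simplified illustration of how 4 tasks with a parallelism of 2 run in "waves" of at most 2 tasks at a time, not a model of Batch's actual scheduler:

```python
def task_waves(task_count: int, parallelism: int) -> list[list[int]]:
    """Group task indexes into waves of at most `parallelism` tasks each."""
    return [
        list(range(start, min(start + parallelism, task_count)))
        for start in range(0, task_count, parallelism)
    ]

# The example job: 4 tasks with a parallelism of 2 run in 2 waves.
waves = task_waves(task_count=4, parallelism=2)
for wave in waves:
    for index in wave:
        # Each task sees its own index through the BATCH_TASK_INDEX variable.
        print(f"Hello world! This is task {index}. This job has a total of 4 tasks.")
```

In practice the tasks in a wave run concurrently on the job's VMs; the sketch only shows which tasks can run at the same time.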

Create a basic container job

You can select or create a container image to provide the code and dependencies for your job to run from any compute environment. For more information, see Working with container images and Running containers on VM instances.

You can create a basic container job using the Google Cloud console, gcloud CLI, Batch API, Go, Java, Node.js, Python, or C++.

Console

To create a basic container job using the Google Cloud console, do the following:

  1. In the Google Cloud console, go to the Job list page.

    Go to Job list

  2. Click Create. The Create batch job page opens. In the left pane, the Job details page is selected.

  3. Configure the Job details page:

    1. Optional: In the Job name field, customize the job name.

      For example, enter example-basic-job.

    2. Configure the Task details section:

      1. In the New runnable window, add at least one script or container for this job to run.

        For example, to add one container, do the following:

        1. Select Container image URL (default).

        2. In the Container image URL field, enter the URL of a container image to run for each task in this job.

          For example, to use the busybox Docker container image, enter the following URL:

          gcr.io/google-containers/busybox
          
        3. Optional: To override the container image's ENTRYPOINT command, enter a command in the Entry point field.

          For example, enter the following:

          /bin/sh
          
        4. Optional: To override the container image's CMD command, do the following:

          1. Select the Override container image's CMD command checkbox. A text box appears.

          2. In the text box, enter one or more commands, separating each command with a new line.

            For example, enter the following commands:

            -c
            echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.
            
          3. Click Done.

      2. In the Task count field, enter the number of tasks for this job. The value must be a whole number between 1 and the task count limit per task group.

        For example, enter 4.

      3. In the Parallelism field, enter the number of tasks to run concurrently. The number can't be larger than the total number of tasks and must be a whole number between 1 and the parallel tasks limit per job.

        For example, enter 2.

  4. Configure the Resource specifications page:

    1. In the left pane, click Resource specifications. The Resource specifications page opens.

    2. In the VM provisioning model section, select the provisioning model for this job's VMs:

      • If your job can withstand preemption and you want VMs at a discounted price, select Spot.

      • Otherwise, select Standard.

      For example, select Standard (default).

    3. Select the location for this job:

      1. In the Region field, select a region.

        For example, select us-central1 (Iowa) (default).

      2. In the Zone field, do one of the following:

        • If you want to restrict this job to run only in a specific zone, select a zone.

        • Otherwise, select any.

        For example, select any (default).

    4. Select one of the following machine families:

      • For common workloads, click General purpose.

      • For performance-intensive workloads, click Compute optimized.

      • For memory-intensive workloads, click Memory optimized.

      For example, click General purpose (default).

    5. In the Series field, select a machine series for this job's VMs.

      For example, if you selected General purpose for the machine family, select E2 (default).

    6. In the Machine type field, select a machine type for this job's VMs.

      For example, if you selected E2 for the machine series, select e2-medium (2 vCPU, 4 GB memory) (default).

    7. Configure the amount of VM resources required for each task:

      1. In the Cores field, enter the number of vCPUs per task.

        For example, enter 1 (default).

      2. In the Memory field, enter the amount of RAM in GB per task.

        For example, enter 0.5 (default).

  5. Optional: To review the job configuration, in the left pane, click Preview.

  6. Click Create.

The Job details page displays the job that you created.

gcloud

To create a basic container job using the gcloud CLI, do the following:

  1. Create a JSON file that specifies your job's configuration details. For example, to create a basic container job, create a JSON file that has the following contents. For more information about all of the fields that you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                CONTAINER
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": CORES,
                        "memoryMib": MEMORY
                    },
                    "maxRetryCount": MAX_RETRY_COUNT,
                    "maxRunDuration": "MAX_RUN_DURATION"
                },
                "taskCount": TASK_COUNT,
                "parallelism": PARALLELISM
            }
        ]
    }
    

    Replace the following:

    • CONTAINER: the container that each task runs. At a minimum, a container must specify an image in the imageUri subfield, but additional subfields might also be required. For more information, see the container subfields and the example container job in this section.
    • CORES: Optional. The amount of cores (specifically vCPUs, which usually represent half a physical core) to allocate for each task, in milliCPU units. If the cpuMilli field isn't specified, the value is set to 2000 (2 vCPUs).
    • MEMORY: Optional. The amount of memory to allocate for each task, in MB. If the memoryMib field isn't specified, the value is set to 2000 (2 GB).
    • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be a whole number between 0 and 10. If the maxRetryCount field isn't specified, the value is set to 0, which means the task isn't retried. For more information about the maxRetryCount field, see Automate task retries.
    • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s, for example, 3600s for 1 hour. If the maxRunDuration field isn't specified, the value is set to the maximum run time for a job. For more information about the maxRunDuration field, see Limit run times for tasks and runnables using timeouts.
    • TASK_COUNT: Optional. The number of tasks for the job. The value must be a whole number between 1 and the task count limit per task group. If the taskCount field isn't specified, the value is set to 1.
    • PARALLELISM: Optional. The number of tasks that the job runs concurrently. The number can't be larger than the number of tasks and must be a whole number between 1 and the parallel tasks limit per job. If the parallelism field isn't specified, the value is set to 1.
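As a quick sanity check for the CORES and MEMORY placeholders, the sketch below converts per-task vCPU and GB amounts into the numbers that the cpuMilli and memoryMib fields expect, using the unit conventions described above (1 vCPU = 1000 milliCPU; memory in MB). The helper names are our own, not part of any Batch library:

```python
# Hypothetical helpers for computing computeResource values in the job JSON.
def to_cpu_milli(vcpus: float) -> int:
    """Convert a per-task vCPU count to milliCPU units (1 vCPU = 1000)."""
    return int(vcpus * 1000)

def to_memory_mib(gb: float) -> int:
    """Convert per-task memory in GB to the MB value memoryMib expects."""
    return int(gb * 1000)

compute_resource = {
    "cpuMilli": to_cpu_milli(2),    # 2 vCPUs per task -> 2000, the default
    "memoryMib": to_memory_mib(2),  # 2 GB per task -> 2000, the default
}
print(compute_resource)
```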
  2. Create a job using the gcloud batch jobs submit command.

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file that has the job's configuration details.

For example, to create a job that runs tasks using the busybox Docker container image, do the following:

  1. Create a JSON file in the current directory named hello-world-container.json that has the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "container": {
                                "imageUri": "gcr.io/google-containers/busybox",
                                "entrypoint": "/bin/sh",
                                "commands": [
                                    "-c",
                                    "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                                ]
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    },
                    "maxRetryCount": 2,
                    "maxRunDuration": "3600s"
                },
                "taskCount": 4,
                "parallelism": 2
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": { "machineType": "e2-standard-4" }
                }
            ]
        },
        "labels": {
            "department": "finance",
            "env": "testing"
        },
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    
  2. Run the following command:

    gcloud batch jobs submit example-container-job \
      --location us-central1 \
      --config hello-world-container.json
    

API

To create a basic container job using the Batch API, use the jobs.create method. For more information about all of the fields that you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            CONTAINER
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": CORES,
                    "memoryMib": MEMORY
                },
                "maxRetryCount": MAX_RETRY_COUNT,
                "maxRunDuration": "MAX_RUN_DURATION"
            },
            "taskCount": TASK_COUNT,
            "parallelism": PARALLELISM
        }
    ]
}

Replace the following:

  • PROJECT_ID: the project ID or project number of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • CONTAINER: the container that each task runs. At a minimum, a container must specify an image in the imageUri subfield, but additional subfields might also be required. For more information, see the container subfields and the example container job in this section.
  • CORES: Optional. The amount of cores (specifically vCPUs, which usually represent half a physical core) to allocate for each task, in milliCPU units. If the cpuMilli field isn't specified, the value is set to 2000 (2 vCPUs).
  • MEMORY: Optional. The amount of memory to allocate for each task, in MB. If the memoryMib field isn't specified, the value is set to 2000 (2 GB).
  • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be a whole number between 0 and 10. If the maxRetryCount field isn't specified, the value is set to 0, which means the task isn't retried. For more information about the maxRetryCount field, see Automate task retries.
  • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s, for example, 3600s for 1 hour. If the maxRunDuration field isn't specified, the value is set to the maximum run time for a job. For more information about the maxRunDuration field, see Limit run times for tasks and runnables using timeouts.
  • TASK_COUNT: Optional. The number of tasks for the job. The value must be a whole number between 1 and the task count limit per task group. If the taskCount field isn't specified, the value is set to 1.
  • PARALLELISM: Optional. The number of tasks that the job runs concurrently. The number can't be larger than the number of tasks and must be a whole number between 1 and the parallel tasks limit per job. If the parallelism field isn't specified, the value is set to 1.
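To reduce typos when filling in the placeholders above, the request body can be assembled programmatically and the documented constraints checked before sending it. This is a sketch; the field names come from the REST resource, while the helper and its validation logic are our own:

```python
import json

def build_job_body(container: dict, task_count: int = 1, parallelism: int = 1) -> dict:
    """Build a minimal jobs.create request body, checking the documented limits."""
    if task_count < 1:
        raise ValueError("taskCount must be at least 1")
    if not 1 <= parallelism <= task_count:
        raise ValueError("parallelism must be between 1 and taskCount")
    if "imageUri" not in container:
        # A container must at least specify an image in the imageUri subfield.
        raise ValueError("a container must at least specify imageUri")
    return {
        "taskGroups": [
            {
                "taskSpec": {"runnables": [{"container": container}]},
                "taskCount": task_count,
                "parallelism": parallelism,
            }
        ]
    }

body = build_job_body(
    {"imageUri": "gcr.io/google-containers/busybox"}, task_count=4, parallelism=2
)
print(json.dumps(body, indent=2))
```

Optional fields such as computeResource, maxRetryCount, and maxRunDuration can be added to the taskSpec dict in the same way.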

For example, to create a job that runs tasks using the busybox Docker container image, use the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-container-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            "imageUri": "gcr.io/google-containers/busybox",
                            "entrypoint": "/bin/sh",
                            "commands": [
                                "-c",
                                "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            ]
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                },
                "maxRetryCount": 2,
                "maxRunDuration": "3600s"
            },
            "taskCount": 4,
            "parallelism": 2
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": { "machineType": "e2-standard-4" }
            }
        ]
    },
    "labels": {
        "department": "finance",
        "env": "testing"
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

where PROJECT_ID is the project ID or project number of your project.

Go

For more information, see the Batch Go API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job that runs the specified container
func createContainerJob(w io.Writer, projectID, region, jobName string) error {
	// projectID := "your_project_id"
	// region := "us-central1"
	// jobName := "some-job"

	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer batchClient.Close()

	container := &batchpb.Runnable_Container{
		ImageUri:   "gcr.io/google-containers/busybox",
		Commands:   []string{"-c", "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."},
		Entrypoint: "/bin/sh",
	}

	// We can specify what resources are requested by each task.
	resources := &batchpb.ComputeResource{
		// CpuMilli is milliseconds per cpu-second. This means the task requires 2 whole CPUs.
		CpuMilli:  2000,
		MemoryMib: 16,
	}

	taskSpec := &batchpb.TaskSpec{
		Runnables: []*batchpb.Runnable{{
			Executable: &batchpb.Runnable_Container_{Container: container},
		}},
		ComputeResource: resources,
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
	}

	// Tasks are grouped inside a job using TaskGroups.
	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	// Policies are used to define the kind of virtual machines the tasks will run on.
	// In this case, we tell the system to use "e2-standard-4" machine type.
	// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					MachineType: "e2-standard-4",
				},
			},
		}},
	}

	// We use Cloud Logging as it's an out of the box available option
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	jobLabels := map[string]string{"env": "testing", "type": "container"}

	// The job's parent is the region in which the job will run
	parent := fmt.Sprintf("projects/%s/locations/%s", projectID, region)

	job := batchpb.Job{
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           jobLabels,
		LogsPolicy:       logsPolicy,
	}

	req := &batchpb.CreateJobRequest{
		Parent: parent,
		JobId:  jobName,
		Job:    &job,
	}

	created_job, err := batchClient.CreateJob(ctx, req)
	if err != nil {
		return fmt.Errorf("unable to create job: %w", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", created_job)

	return nil
}

Java

For more information, see the Batch Java API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicyOrTemplate;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.LogsPolicy.Destination;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Container;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateWithContainerNoMounting {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";

    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";

    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";

    createContainerJob(projectId, region, jobName);
  }

  // This method shows how to create a sample Batch Job that will run a simple command inside a
  // container on Cloud Compute instances.
  public static void createContainerJob(String projectId, String region, String jobName)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the `batchServiceClient.close()` method on the client to safely
    // clean up any remaining background resources.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setContainer(
                  Container.newBuilder()
                      .setImageUri("gcr.io/google-containers/busybox")
                      .setEntrypoint("/bin/sh")
                      .addCommands("-c")
                      .addCommands(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                              + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      .build())
              .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 2 whole CPUs.
              .setCpuMilli(2000)
              // In MiB.
              .setMemoryMib(16)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addRunnables(runnable)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(4).setTaskSpec(task).build();

      // Policies are used to define the kind of virtual machines the tasks will run on.
      // In this case, we tell the system to use "e2-standard-4" machine type.
      // Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
      InstancePolicy instancePolicy =
          InstancePolicy.newBuilder().setMachineType("e2-standard-4").build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(InstancePolicyOrTemplate.newBuilder().setPolicy(instancePolicy).build())
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "container")
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());
    }
  }
}

Node.js

For more information, see the Batch Node.js API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment and replace these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
/**
 * The region you want the job to run in. The regions that support Batch are listed here:
 * https://cloud.google.com/batch/docs/get-started#locations
 */
// const region = 'us-central1';
/**
 * The name of the job that will be created.
 * It needs to be unique for each project and region pair.
 */
// const jobName = 'YOUR_JOB_NAME';

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

// Define what will be done as part of the job.
const task = new batch.TaskSpec();
const runnable = new batch.Runnable();
runnable.container = new batch.Runnable.Container();
runnable.container.imageUri = 'gcr.io/google-containers/busybox';
runnable.container.entrypoint = '/bin/sh';
runnable.container.commands = [
  '-c',
  'echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.',
];
task.runnables = [runnable];

// We can specify what resources are requested by each task.
const resources = new batch.ComputeResource();
resources.cpuMilli = 2000; // in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
resources.memoryMib = 16;
task.computeResource = resources;

task.maxRetryCount = 2;
task.maxRunDuration = {seconds: 3600};

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup();
group.taskCount = 4;
group.taskSpec = task;

// Policies are used to define the kind of virtual machines the tasks will run on.
// In this case, we tell the system to use "e2-standard-4" machine type.
// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
const allocationPolicy = new batch.AllocationPolicy();
const policy = new batch.AllocationPolicy.InstancePolicy();
policy.machineType = 'e2-standard-4';
const instances = new batch.AllocationPolicy.InstancePolicyOrTemplate();
instances.policy = policy;
allocationPolicy.instances = [instances];

const job = new batch.Job();
job.name = jobName;
job.taskGroups = [group];
job.allocationPolicy = allocationPolicy;
job.labels = {env: 'testing', type: 'container'};
// We use Cloud Logging as it's an option available out of the box
job.logsPolicy = new batch.LogsPolicy();
job.logsPolicy.destination = batch.LogsPolicy.Destination.CLOUD_LOGGING;

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const response = await batchClient.createJob(request);
  console.log(response);
}

await callCreateJob();

Python

For more information, see the Batch Python API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import batch_v1


def create_container_job(project_id: str, region: str, job_name: str) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command inside a container on Cloud Compute instances.

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    runnable = batch_v1.Runnable()
    runnable.container = batch_v1.Runnable.Container()
    runnable.container.image_uri = "gcr.io/google-containers/busybox"
    runnable.container.entrypoint = "/bin/sh"
    runnable.container.commands = [
        "-c",
        "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.",
    ]

    # Jobs can be divided into tasks. In this case, we have only one task.
    task = batch_v1.TaskSpec()
    task.runnables = [runnable]

    # We can specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16  # in MiB
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    # Policies are used to define the kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "container"}
    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

C++

For more information, see the Batch C++ API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

#include "google/cloud/batch/v1/batch_client.h"
#include <google/protobuf/text_format.h>
#include <iostream>

  [](std::string const& project_id, std::string const& location_id,
     std::string const& job_id) {
    // Initialize the request; start with the fields that depend on the sample
    // input.
    google::cloud::batch::v1::CreateJobRequest request;
    request.set_parent("projects/" + project_id + "/locations/" + location_id);
    request.set_job_id(job_id);
    // Most of the job description is fixed in this example; use a string to
    // initialize it.
    auto constexpr kText = R"pb(
      task_groups {
        task_count: 4
        task_spec {
          compute_resource { cpu_milli: 500 memory_mib: 16 }
          max_retry_count: 2
          max_run_duration { seconds: 3600 }
          runnables {
            container {
              image_uri: "gcr.io/google-containers/busybox"
              entrypoint: "/bin/sh"
              commands: "-c"
              commands: "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
            }
          }
        }
      }
      allocation_policy {
        instances {
          policy { machine_type: "e2-standard-4" provisioning_model: STANDARD }
        }
      }
      labels { key: "env" value: "testing" }
      labels { key: "type" value: "container" }
      logs_policy { destination: CLOUD_LOGGING }
    )pb";
    auto* job = request.mutable_job();
    if (!google::protobuf::TextFormat::ParseFromString(kText, job)) {
      throw std::runtime_error("Error parsing Job description");
    }
    // Create a client and issue the request.
    auto client = google::cloud::batch_v1::BatchServiceClient(
        google::cloud::batch_v1::MakeBatchServiceConnection());
    auto response = client.CreateJob(request);
    if (!response) throw std::move(response).status();
    std::cout << "Job : " << response->DebugString() << "\n";
  }

Create a basic script job

You can create a basic script job using the Google Cloud console, gcloud CLI, Batch API, Go, Java, Node.js, Python, or C++.

Console

To create a basic script job using the Google Cloud console, do the following:

  1. In the Google Cloud console, go to the Job list page.

    Go to Job list

  2. Click Create. The Create batch job page opens. In the left pane, the Job details page is selected.

  3. Configure the Job details page:

    1. Optional: In the Job name field, customize the job name.

      For example, enter example-basic-job.

    2. Configure the Task details section:

      1. In the New runnable window, add at least one script or container for this job to run.

        For example, to add one script, do the following:

        1. Select Script. A text box appears.

        2. In the text box, enter the script that you want to run for each task in this job.

          For example, enter the following script:

          echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.
          
        3. Click Done.

      2. In the Task count field, enter the number of tasks for this job. The value must be a whole number between 1 and the task count limit per task group.

        For example, enter 4.

      3. In the Parallelism field, enter the number of tasks to run concurrently. The number can't be larger than the total number of tasks and must be a whole number between 1 and the parallel tasks limit per job.

        For example, enter 2.

  4. Configure the Resource specifications page:

    1. In the left pane, click Resource specifications. The Resource specifications page opens.

    2. In the VM provisioning model section, select the provisioning model for this job's VMs:

      • If your job can withstand preemption and you want VMs at a discounted price, select Spot.

      • Otherwise, select Standard.

      For example, select Standard (default).

    3. Select the location for this job:

      1. In the Region field, select a region.

        For example, select us-central1 (Iowa) (default).

      2. In the Zone field, do one of the following:

        • If you want to restrict this job to run only in a specific zone, select a zone.

        • Otherwise, select any.

        For example, select any (default).

    4. Select one of the following machine families:

      • For common workloads, click General purpose.

      • For performance-intensive workloads, click Compute optimized.

      • For memory-intensive workloads, click Memory optimized.

      For example, click General purpose (default).

    5. In the Series field, select a machine series for this job's VMs.

      For example, if you selected General purpose for the machine family, select E2 (default).

    6. In the Machine type field, select a machine type for this job's VMs.

      For example, if you selected E2 for the machine series, select e2-medium (2 vCPU, 4 GB memory) (default).

    7. Configure the amount of VM resources required for each task:

      1. In the Cores field, enter the number of vCPUs per task.

        For example, enter 1 (default).

      2. In the Memory field, enter the amount of RAM in GB per task.

        For example, enter 0.5 (default).

  5. Optional: To review the job configuration, in the left pane, click Preview.

  6. Click Create.

The Job details page displays the job that you created.

gcloud

To create a basic script job using the gcloud CLI, do the following:

  1. Create a JSON file that specifies your job's configuration details. For example, to create a basic script job, create a JSON file that has the following contents. For more information about all of the fields that you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                SCRIPT
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": CORES,
                        "memoryMib": MEMORY
                    },
                    "maxRetryCount": MAX_RETRY_COUNT,
                    "maxRunDuration": "MAX_RUN_DURATION"
                },
                "taskCount": TASK_COUNT,
                "parallelism": PARALLELISM
            }
        ]
    }
    

    Replace the following:

    • SCRIPT: the script that each task runs. A script must be defined as text using the text subfield or as the path to an accessible file using the file subfield. For more information, see the script subfields and the example script job in this section.
    • CORES: Optional. The amount of cores (specifically vCPUs, which usually represent half a physical core) to allocate for each task, in milliCPU units. If the cpuMilli field isn't specified, the value is set to 2000 (2 vCPUs).
    • MEMORY: Optional. The amount of memory to allocate for each task, in MB. If the memoryMib field isn't specified, the value is set to 2000 (2 GB).
    • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be a whole number between 0 and 10. If the maxRetryCount field isn't specified, the value is set to 0, which means the task isn't retried. For more information about the maxRetryCount field, see Automate task retries.
    • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s, for example, 3600s for 1 hour. If the maxRunDuration field isn't specified, the value is set to the maximum run time for a job. For more information about the maxRunDuration field, see Limit run times for tasks and runnables using timeouts.
    • TASK_COUNT: Optional. The number of tasks for the job. The value must be a whole number between 1 and the task count limit per task group. If the taskCount field isn't specified, the value is set to 1.
    • PARALLELISM: Optional. The number of tasks that the job runs concurrently. The number can't be larger than the number of tasks and must be a whole number between 1 and the parallel tasks limit per job. If the parallelism field isn't specified, the value is set to 1.
  2. Create a job using the gcloud batch jobs submit command.

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file that has the job's configuration details.

For example, to create a job that runs tasks using a script, do the following:

  1. Create a JSON file in the current directory named hello-world-script.json that has the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    },
                    "maxRetryCount": 2,
                    "maxRunDuration": "3600s"
                },
                "taskCount": 4,
                "parallelism": 2
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": { "machineType": "e2-standard-4" }
                }
            ]
        },
        "labels": {
            "department": "finance",
            "env": "testing"
        },
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    
  2. Run the following command:

    gcloud batch jobs submit example-script-job \
      --location us-central1 \
      --config hello-world-script.json
    
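The script in this example relies on the predefined BATCH_TASK_INDEX and BATCH_TASK_COUNT variables, which Batch sets in each task's environment. The sketch below simulates the resulting output locally by setting those variables ourselves and running the script once per task; it only illustrates the variable expansion, not how Batch actually executes tasks:

```python
import subprocess

script = ("echo Hello world! This is task ${BATCH_TASK_INDEX}. "
          "This job has a total of ${BATCH_TASK_COUNT} tasks.")

task_count = 4
outputs = []
for index in range(task_count):
    # Batch sets these variables in each task's environment; here we set
    # them ourselves to simulate one task run at a time.
    env = {"BATCH_TASK_INDEX": str(index), "BATCH_TASK_COUNT": str(task_count)}
    result = subprocess.run(
        ["/bin/sh", "-c", script], env=env, capture_output=True, text=True
    )
    outputs.append(result.stdout.strip())

for line in outputs:
    print(line)
```

Each task prints its own index, so the job as a whole produces one "Hello world" line per task, in whatever order the tasks finish.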

API

To create a basic script job using the Batch API, use the jobs.create method. For more information about all of the fields that you can specify for a job, see the reference documentation for the projects.locations.jobs REST resource.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            SCRIPT
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": CORES,
                    "memoryMib": MEMORY
                },
                "maxRetryCount": MAX_RETRY_COUNT,
                "maxRunDuration": "MAX_RUN_DURATION"
            },
            "taskCount": TASK_COUNT,
            "parallelism": PARALLELISM
        }
    ]
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • SCRIPT: the script that each task runs. A script must be defined either as text, using the text subfield, or as the path to an accessible file, using the path subfield. For more information, see the script subfields and the example script job in this section.
  • CORES: Optional. The amount of cores (specifically vCPUs, which usually represent half a physical core) to allocate to each task, in milliCPU units. If the cpuMilli field is not specified, the value is set to 2000 (2 vCPUs).
  • MEMORY: Optional. The amount of memory to allocate to each task, in MB. If the memoryMib field is not specified, the value is set to 2000 (2 GB).
  • MAX_RETRY_COUNT: Optional. The maximum number of retries for a task. The value must be an integer between 0 and 10. If the maxRetryCount field is not specified, the value is set to 0, which means the task is not retried. For more information about the maxRetryCount field, see Automate task retries.
  • MAX_RUN_DURATION: Optional. The maximum time a task is allowed to run before being retried or failing, formatted as a value in seconds followed by s, for example 3600s for 1 hour. If the maxRunDuration field is not specified, the value is set to the maximum run time for a job. For more information about the maxRunDuration field, see Limit run times for tasks and runnables using timeouts.
  • TASK_COUNT: Optional. The number of tasks for the job. The value must be an integer between 1 and the limit of tasks per task group. If the taskCount field is not specified, the value is set to 1.
  • PARALLELISM: Optional. The number of tasks the job runs concurrently. The number cannot be larger than the number of tasks and must be an integer between 1 and the limit of parallel tasks per job. If the parallelism field is not specified, the value is set to 1.
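As a quick sanity check on how these placeholders fit together, the following sketch (illustrative plain Python, not the official client library) assembles the same request body from parameters and enforces the documented defaults and ranges. The function name and parameter names are illustrative only.

```python
def build_script_job_body(script_text, cores=2000, memory_mib=2000,
                          max_retry_count=0, max_run_duration="3600s",
                          task_count=1, parallelism=1):
    """Assemble a jobs.create request body for a script job,
    enforcing the documented value ranges."""
    if not 0 <= max_retry_count <= 10:
        raise ValueError("maxRetryCount must be an integer between 0 and 10")
    if task_count < 1:
        raise ValueError("taskCount must be at least 1")
    if not 1 <= parallelism <= task_count:
        raise ValueError("parallelism must be between 1 and taskCount")
    return {
        "taskGroups": [{
            "taskSpec": {
                "runnables": [{"script": {"text": script_text}}],
                "computeResource": {"cpuMilli": cores, "memoryMib": memory_mib},
                "maxRetryCount": max_retry_count,
                "maxRunDuration": max_run_duration,
            },
            "taskCount": task_count,
            "parallelism": parallelism,
        }]
    }
```

For example, `build_script_job_body("echo hi", task_count=4, parallelism=2)` produces a body matching the example request below, while a parallelism larger than the task count is rejected.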

For example, to create a job that runs tasks using a script, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-script-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                },
                "maxRetryCount": 2,
                "maxRunDuration": "3600s"
            },
            "taskCount": 4,
            "parallelism": 2
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": { "machineType": "e2-standard-4" }
            }
        ]
    },
    "labels": {
        "department": "finance",
        "env": "testing"
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

where PROJECT_ID is the project ID of your project.

Go

For more information, see the Batch Go API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job that executes the specified script
func createScriptJob(w io.Writer, projectID, region, jobName string) error {
	// projectID := "your_project_id"
	// region := "us-central1"
	// jobName := "some-job"

	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer batchClient.Close()

	// Define what will be done as part of the job.
	command := &batchpb.Runnable_Script_Text{
		Text: "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.",
	}
	// You can also run a script from a file. Just remember, that needs to be a script that's
	// already on the VM that will be running the job.
	// Using runnable.script.text and runnable.script.path is mutually exclusive.
	// command := &batchpb.Runnable_Script_Path{
	// 	Path: "/tmp/test.sh",
	// }

	// We can specify what resources are requested by each task.
	resources := &batchpb.ComputeResource{
		// CpuMilli is milliseconds per cpu-second. This means the task requires 2 whole CPUs.
		CpuMilli:  2000,
		MemoryMib: 16,
	}

	taskSpec := &batchpb.TaskSpec{
		Runnables: []*batchpb.Runnable{{
			Executable: &batchpb.Runnable_Script_{
				Script: &batchpb.Runnable_Script{Command: command},
			},
		}},
		ComputeResource: resources,
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
	}

	// Tasks are grouped inside a job using TaskGroups.
	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	// Policies are used to define on what kind of virtual machines the tasks will run on.
	// In this case, we tell the system to use "e2-standard-4" machine type.
	// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					MachineType: "e2-standard-4",
				},
			},
		}},
	}

	// We use Cloud Logging as it's an out of the box available option
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	jobLabels := map[string]string{"env": "testing", "type": "script"}

	// The job's parent is the region in which the job will run
	parent := fmt.Sprintf("projects/%s/locations/%s", projectID, region)

	job := batchpb.Job{
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           jobLabels,
		LogsPolicy:       logsPolicy,
	}

	req := &batchpb.CreateJobRequest{
		Parent: parent,
		JobId:  jobName,
		Job:    &job,
	}

	created_job, err := batchClient.CreateJob(ctx, req)
	if err != nil {
		return fmt.Errorf("unable to create job: %w", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", created_job)

	return nil
}

Java

For more information, see the Batch Java API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicyOrTemplate;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.LogsPolicy.Destination;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Script;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateWithScriptNoMounting {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";

    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";

    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";

    createScriptJob(projectId, region, jobName);
  }

  // This method shows how to create a sample Batch Job that will run
  // a simple command on Cloud Compute instances.
  public static void createScriptJob(String projectId, String region, String jobName)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the `batchServiceClient.close()` method on the client to safely
    // clean up any remaining background resources.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setScript(
                  Script.newBuilder()
                      .setText(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                              + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      // You can also run a script from a file. Just remember, that needs to be a
                      // script that's already on the VM that will be running the job.
                      // Using setText() and setPath() is mutually exclusive.
                      // .setPath("/tmp/test.sh")
                      .build())
              .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 2 whole CPUs.
              .setCpuMilli(2000)
              // In MiB.
              .setMemoryMib(16)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addRunnables(runnable)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(4).setTaskSpec(task).build();

      // Policies are used to define on what kind of virtual machines the tasks will run on.
      // In this case, we tell the system to use "e2-standard-4" machine type.
      // Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
      InstancePolicy instancePolicy =
          InstancePolicy.newBuilder().setMachineType("e2-standard-4").build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(InstancePolicyOrTemplate.newBuilder().setPolicy(instancePolicy).build())
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());
    }
  }
}

Node.js

For more information, see the Batch Node.js API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment and replace these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
/**
 * The region you want to the job to run in. The regions that support Batch are listed here:
 * https://cloud.google.com/batch/docs/get-started#locations
 */
// const region = 'us-central-1';
/**
 * The name of the job that will be created.
 * It needs to be unique for each project and region pair.
 */
// const jobName = 'YOUR_JOB_NAME';

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

// Define what will be done as part of the job.
const task = new batch.TaskSpec();
const runnable = new batch.Runnable();
runnable.script = new batch.Runnable.Script();
runnable.script.text =
  'echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.';
// You can also run a script from a file. Just remember, that needs to be a script that's
// already on the VM that will be running the job. Using runnable.script.text and runnable.script.path is mutually
// exclusive.
// runnable.script.path = '/tmp/test.sh'
task.runnables = [runnable];

// We can specify what resources are requested by each task.
const resources = new batch.ComputeResource();
resources.cpuMilli = 2000; // in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
resources.memoryMib = 16;
task.computeResource = resources;

task.maxRetryCount = 2;
task.maxRunDuration = {seconds: 3600};

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup();
group.taskCount = 4;
group.taskSpec = task;

// Policies are used to define on what kind of virtual machines the tasks will run on.
// In this case, we tell the system to use "e2-standard-4" machine type.
// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
const allocationPolicy = new batch.AllocationPolicy();
const policy = new batch.AllocationPolicy.InstancePolicy();
policy.machineType = 'e2-standard-4';
const instances = new batch.AllocationPolicy.InstancePolicyOrTemplate();
instances.policy = policy;
allocationPolicy.instances = [instances];

const job = new batch.Job();
job.name = jobName;
job.taskGroups = [group];
job.allocationPolicy = allocationPolicy;
job.labels = {env: 'testing', type: 'script'};
// We use Cloud Logging as it's an option available out of the box
job.logsPolicy = new batch.LogsPolicy();
job.logsPolicy.destination = batch.LogsPolicy.Destination.CLOUD_LOGGING;

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const response = await batchClient.createJob(request);
  console.log(response);
}

await callCreateJob();

Python

For more information, see the Batch Python API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import batch_v1


def create_script_job(project_id: str, region: str, job_name: str) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command on Cloud Compute instances.

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    task = batch_v1.TaskSpec()
    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
    # You can also run a script from a file. Just remember, that needs to be a script that's
    # already on the VM that will be running the job. Using runnable.script.text and runnable.script.path is mutually
    # exclusive.
    # runnable.script.path = '/tmp/test.sh'
    task.runnables = [runnable]

    # We can specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    allocation_policy = batch_v1.AllocationPolicy()
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy.instances = [instances]

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "script"}
    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

C++

For more information, see the Batch C++ API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

#include "google/cloud/batch/v1/batch_client.h"
#include <google/protobuf/text_format.h>

  [](std::string const& project_id, std::string const& location_id,
     std::string const& job_id) {
    // Initialize the request; start with the fields that depend on the sample
    // input.
    google::cloud::batch::v1::CreateJobRequest request;
    request.set_parent("projects/" + project_id + "/locations/" + location_id);
    request.set_job_id(job_id);
    // Most of the job description is fixed in this example; use a string to
    // initialize it.
    auto constexpr kText = R"pb(
      task_groups {
        task_count: 4
        task_spec {
          compute_resource { cpu_milli: 500 memory_mib: 16 }
          max_retry_count: 2
          max_run_duration { seconds: 3600 }
          runnables {
            script {
              text: "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
            }
          }
        }
      }
      allocation_policy {
        instances {
          policy { machine_type: "e2-standard-4" provisioning_model: STANDARD }
        }
      }
      labels { key: "env" value: "testing" }
      labels { key: "type" value: "script" }
      logs_policy { destination: CLOUD_LOGGING }
    )pb";
    auto* job = request.mutable_job();
    if (!google::protobuf::TextFormat::ParseFromString(kText, job)) {
      throw std::runtime_error("Error parsing Job description");
    }
    // Create a client and issue the request.
    auto client = google::cloud::batch_v1::BatchServiceClient(
        google::cloud::batch_v1::MakeBatchServiceConnection());
    auto response = client.CreateJob(request);
    if (!response) throw std::move(response).status();
    std::cout << "Job : " << response->DebugString() << "\n";
  }

Use environment variables

Use environment variables in the container images or scripts that you want a job to run. You can use the environment variables predefined for all Batch jobs and any custom environment variables that you define while creating a job.

Use predefined environment variables

By default, the runnables in a job can use the following predefined environment variables:

  • BATCH_TASK_COUNT: the total number of tasks in this task group.
  • BATCH_TASK_INDEX: the index number of this task in the task group. The index of the first task is 0 and is incremented for each additional task.
  • BATCH_HOSTS_FILE: the path to a file listing all the running VM instances in this task group. To use this environment variable, the requireHostsFile field must be set to true.
  • BATCH_TASK_RETRY_ATTEMPT: the number of times that this task has already been attempted. The value is 0 during the first attempt of a task and is incremented for each subsequent retry. The total number of retries allowed for a task is determined by the value of the maxRetryCount field, which is 0 if undefined. For more information about retries, see Automate task retries.
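These predefined variables are commonly used to split a workload across tasks. The following is a minimal sketch (plain Python, assuming your runnable invokes a Python script; the file names are hypothetical) of how each task could use BATCH_TASK_INDEX and BATCH_TASK_COUNT to pick its own share of the input:

```python
import os


def shard_for_task(items, task_index, task_count):
    """Return the subset of items this task should process.

    Each task takes every task_count-th item, starting at its own index,
    so the full list is covered exactly once across all tasks.
    """
    return items[task_index::task_count]


# Inside a Batch runnable, these values come from the predefined
# environment variables; the fallbacks here are only for local testing.
task_index = int(os.environ.get("BATCH_TASK_INDEX", "0"))
task_count = int(os.environ.get("BATCH_TASK_COUNT", "1"))

inputs = [f"file-{i}.csv" for i in range(10)]  # hypothetical input files
print(shard_for_task(inputs, task_index, task_count))
```

With 4 tasks, task 0 processes items 0, 4, and 8; task 3 processes items 3 and 7; no item is processed twice.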

For examples of how to use predefined environment variables, see the previous example runnables in Create a basic job in this document.

Define and use custom environment variables

Optionally, you can define one or more custom environment variables in a job.

You can define each variable in a specific environment based on the intended scope of its data. In the selected environment, you define the name and value of each variable.

You can define and use custom environment variables while creating a job using the gcloud CLI or the Batch API. The following examples explain how to create two jobs that define and use custom variables. The first example job has a variable for a specific runnable. The second example job has an array variable, which has a different value for each task.

gcloud

If you want to define a job that passes an environment variable to a runnable that each task runs, see the example of how to define and use an environment variable for a runnable. Otherwise, if you want to define a job that passes a list of environment variables to different tasks based on the task index, see the example of how to define and use an environment variable for each task.

Define and use an environment variable for a runnable

To create a job that passes an environment variable to a runnable using the gcloud CLI, use the gcloud batch jobs submit command and specify the environment variable in the job's configuration file.

For example, to create a script job that defines an environment variable and passes it to the scripts of 3 tasks, do the following:

  1. Create a JSON file named hello-world-environment-variables.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello ${VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            },
                            "environment": {
                                "variables": {
                                    "VARIABLE_NAME": "VARIABLE_VALUE"
                                }
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    }
                },
                "taskCount": 3,
                "parallelism": 1
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "machineType": "e2-standard-4"
                    }
                }
            ]
        }
    }
    

    Replace the following:

    • VARIABLE_NAME: the name of the environment variable passed to each task. By convention, environment variable names are capitalized.
    • VARIABLE_VALUE: Optional. The value of the environment variable passed to each task.
  2. Run the following command:

    gcloud batch jobs submit example-environment-variables-job \
      --location us-central1 \
      --config hello-world-environment-variables.json
    

Define and use an environment variable for each task

To create a job that passes environment variables to tasks based on the task index using the gcloud CLI, use the gcloud batch jobs submit command and specify the taskEnvironments array field in the job's configuration file.

For example, to create a job that includes an array of 3 environment variables with matching names and different values, and passes the environment variables to the scripts of the tasks whose indexes match the indexes of the environment variables in the array, do the following:

  1. Create a JSON file named hello-world-task-environment-variables.json with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello ${TASK_VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                            }
                        }
                    ],
                    "computeResource": {
                        "cpuMilli": 2000,
                        "memoryMib": 16
                    }
                },
                "taskCount": 3,
                "taskEnvironments": [
                    {
                        "variables": {
                            "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_0"
                        }
                    },
                    {
                        "variables": {
                            "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_1"
                        }
                    },
                    {
                        "variables": {
                            "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_2"
                        }
                    }
                ]
            }
        ],
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "machineType": "e2-standard-4"
                    }
                }
            ]
        }
    }
    

    Replace the following:

    • TASK_VARIABLE_NAME: the name of the task environment variables passed to the tasks with matching indexes. By convention, environment variable names are capitalized.
    • TASK_VARIABLE_VALUE_0: the value of the environment variable passed to the first task, for which BATCH_TASK_INDEX is equal to 0.
    • TASK_VARIABLE_VALUE_1: the value of the environment variable passed to the second task, for which BATCH_TASK_INDEX is equal to 1.
    • TASK_VARIABLE_VALUE_2: the value of the environment variable passed to the third task, for which BATCH_TASK_INDEX is equal to 2.
  2. Run the following command:

    gcloud batch jobs submit example-task-environment-variables-job \
      --location us-central1 \
      --config hello-world-task-environment-variables.json
    
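Writing the taskEnvironments array by hand scales poorly beyond a few tasks. The following sketch (plain Python, illustrative only; the function name is hypothetical) generates the array, where the entry at index N is delivered to the task whose BATCH_TASK_INDEX equals N:

```python
def task_environments(name, values):
    """Build a taskEnvironments array: the entry at index N is assigned
    to the task whose BATCH_TASK_INDEX equals N."""
    return [{"variables": {name: value}} for value in values]


# One entry per task; the job's taskCount should match the array length.
envs = task_environments("TASK_VARIABLE_NAME",
                         ["value-0", "value-1", "value-2"])
print(envs)
```

The resulting list can be placed directly in the taskEnvironments field of the configuration file before submitting the job.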

API

If you want to define a job that passes an environment variable to a runnable that each task runs, see the example of how to define and use an environment variable for a runnable. Otherwise, if you want to define a job that passes a list of environment variables to different tasks based on the task index, see the example of how to define and use an environment variable for each task.

Define and use an environment variable for a runnable

To create a job that passes an environment variable to a runnable using the Batch API, use the jobs.create method and specify the environment variable in the environment field.

For example, to create a job that includes an environment variable and passes it to the scripts of 3 tasks, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-environment-variables-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello ${VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                        },
                        "environment": {
                            "variables": {
                                "VARIABLE_NAME": "VARIABLE_VALUE"
                            }
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                }
            },
            "taskCount": 3,
            "parallelism": 1
        }

    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": {
                    "machineType": "e2-standard-4"
                }
            }
        ]
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • VARIABLE_NAME: the name of the environment variable passed to each task. By convention, environment variable names are capitalized.
  • VARIABLE_VALUE: the value of the environment variable passed to each task.

Define and use an environment variable for each task

To create a job that passes environment variables to tasks based on the task index using the Batch API, use the jobs.create method and specify the environment variables in the taskEnvironments array field.

For example, to create a job that includes an array of 3 environment variables with matching names and different values, and passes the environment variables to the scripts of 3 tasks based on their indexes, make the following request:

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/jobs?job_id=example-task-environment-variables-job

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello ${TASK_VARIABLE_NAME}! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
                        }
                    }
                ],
                "computeResource": {
                    "cpuMilli": 2000,
                    "memoryMib": 16
                }
            },
            "taskCount": 3,
            "taskEnvironments": [
                {
                    "variables": {
                        "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_0"
                    }
                },
                {
                    "variables": {
                        "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_1"
                    }
                },
                {
                    "variables": {
                        "TASK_VARIABLE_NAME": "TASK_VARIABLE_VALUE_2"
                    }
                }
            ]
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "policy": { "machineType": "e2-standard-4" }
            }
        ]
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • TASK_VARIABLE_NAME: the name of the environment variable passed to the tasks with matching indexes. By convention, environment variable names are capitalized.
  • TASK_VARIABLE_VALUE_0: the value of the environment variable passed to the first task, for which BATCH_TASK_INDEX is equal to 0.
  • TASK_VARIABLE_VALUE_1: the value of the environment variable passed to the second task, for which BATCH_TASK_INDEX is equal to 1.
  • TASK_VARIABLE_VALUE_2: the value of the environment variable passed to the third task, for which BATCH_TASK_INDEX is equal to 2.

What's next