Create and run a job that uses storage volumes

This document explains how to create and run a Batch job that uses one or more external storage volumes. External storage options include new or existing persistent disks, new local SSDs, existing Cloud Storage buckets, and existing network file systems (NFS) such as Filestore file shares.

Regardless of whether you add external storage volumes, each Compute Engine VM for a job has a boot disk, which provides storage for the job's operating system (OS) image and instructions. For information about configuring the boot disk for a job, see the VM OS environment overview instead.

Before you begin

  1. If you haven't used Batch before, review Get started with Batch and enable Batch by completing the prerequisites for projects and users.
  2. To get the permissions that you need to create a job, ask your administrator to grant you the following IAM roles:

    For more information about granting roles, see Manage access to projects, folders, and organizations.

    You might also be able to get the required permissions through custom roles or other predefined roles.

Create a job that uses storage volumes

Optionally, a job can use one or more of each of the following types of external storage volumes. For more information about all of the types of storage volumes and the differences and restrictions for each, see the documentation for Compute Engine VM storage options.

You can allow a job to use each storage volume by including it in your job's definition and specifying its mount path (mountPath) in your runnables. To learn how to create a job that uses storage volumes, see one or more of the following sections:

Use a persistent disk

A job that uses persistent disks has the following restrictions:

  • All persistent disks: Review the restrictions for all persistent disks.

  • New versus existing persistent disks: Each persistent disk in a job can be either new (defined in and created with the job) or existing (already created in your project and specified in the job). To use a persistent disk, it needs to be formatted and mounted to the job's VMs, which must be in the same location as the persistent disk. Batch mounts any persistent disks that you include in a job and formats any new persistent disks, but you must format and unmount any existing persistent disks that you want a job to use.

    The supported location options, format options, and mount options vary between new and existing persistent disks, as described in the following table:

    Format options

    • New persistent disks: Formatted automatically with an ext4 file system.
    • Existing persistent disks: You must format the persistent disk to use an ext4 file system before you can use it for a job.

    Mount options

    • New persistent disks: All options are supported.
    • Existing persistent disks: All options except writing are supported. This is due to limitations of multi-writer mode. You must detach an existing persistent disk from any VMs that it is attached to before you can use it for a job.

    Location options

    • New persistent disks: You can only create zonal persistent disks. You can select any location for your job; the persistent disks are created in the zone that your job runs in.
    • Existing persistent disks: You can select both zonal and regional persistent disks. You must set the job's location (or, if specified, just the job's allowed locations) to only locations that contain all of the job's persistent disks. For example, for a zonal persistent disk, the job's location must be the disk's zone; for a regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located.

  • Instance templates: If you want to use a VM instance template while creating this job, you must attach any persistent disks for this job in the instance template. Otherwise, if you don't want to use an instance template, you must attach any persistent disks directly in the job definition.
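The location rules above can be sketched in plain Python. This is purely illustrative — the dict shapes and the function name below are invented for this example and are not part of the Batch API:

```python
# Illustrative only: compute which allowedLocations values are compatible
# with a job's existing persistent disks, per the rules above.
def allowed_locations_for_disks(disks):
    """Return sorted location strings that satisfy every disk.

    Each disk is a hypothetical dict, for example:
      {"type": "zonal", "zone": "us-central1-a"}
      {"type": "regional", "region": "us-central1",
       "zones": ["us-central1-a", "us-central1-b"]}
    """
    allowed = None
    for disk in disks:
        if disk["type"] == "zonal":
            # A zonal disk pins the job to exactly that zone.
            options = {"zones/" + disk["zone"]}
        else:
            # A regional disk allows its region or its specific zones.
            options = {"regions/" + disk["region"]}
            options |= {"zones/" + z for z in disk.get("zones", [])}
        # Every disk must be reachable, so intersect the options.
        allowed = options if allowed is None else allowed & options
    return sorted(allowed) if allowed else []
```

For example, a job that uses both a zonal disk in us-central1-a and a regional disk spanning us-central1-a and us-central1-b could only allow zones/us-central1-a.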

You can create a job that uses a persistent disk using the Google Cloud console, gcloud CLI, Batch API, C++, Go, Java, Node.js, or Python.

Console

The following example uses the Google Cloud console to create a job that runs a script to read a file from an existing zonal persistent disk that is located in the us-central1-a zone. The example script assumes that the job has an existing zonal persistent disk that contains a text file named example.txt in its root directory.

Optional: Create an example zonal persistent disk

If you want to create a zonal persistent disk that you can use to run the example script, do the following before creating your job:

  1. Attach a new, blank persistent disk named example-disk to a Linux VM in the us-central1-a zone, and then run commands on the VM to format and mount the disk. For instructions, see Add a persistent disk to your VM.

    Do not disconnect from the VM yet.

  2. To create example.txt on the persistent disk, run the following commands on the VM:

    1. To change the current working directory to the root directory of the persistent disk, enter the following command:

      cd VM_MOUNT_PATH
      

      Replace VM_MOUNT_PATH with the path of the directory where you mounted the persistent disk to this VM in the previous step, for example /mnt/disks/example-disk.

    2. Press Enter.

    3. To create and define a file named example.txt, enter the following command:

      cat > example.txt
      
    4. Press Enter.

    5. Type the contents of the file. For example, type Hello world!

    6. To save the file, press Ctrl+D (or Command+D on macOS).

    After you're done, you can disconnect from the VM.

  3. Detach the persistent disk from the VM.

    • If you don't need the VM anymore, you can delete the VM, which automatically detaches the persistent disk.

    • Otherwise, detach the persistent disk. For instructions, see Detach and reattach boot disks, and detach the example-disk persistent disk instead of the VM's boot disk.
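The interactive file-creation commands in step 2 can also be scripted. A minimal sketch, assuming the disk is already formatted and mounted — the default path below is only a stand-in for your actual mount path:

```python
# Non-interactive equivalent of step 2: write example.txt to the disk's
# root directory.
import os

def write_example_file(mount_path="/tmp/example-disk", text="Hello world!\n"):
    """Create example.txt under mount_path and return its full path."""
    os.makedirs(mount_path, exist_ok=True)  # no-op if the disk is mounted here
    path = os.path.join(mount_path, "example.txt")
    with open(path, "w") as f:
        f.write(text)
    return path
```

On the VM, you would call write_example_file("/mnt/disks/example-disk") (or whatever your mount path is) instead of the default.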

Create a job that uses an existing zonal persistent disk

To create a job that uses an existing zonal persistent disk using the Google Cloud console, do the following:

  1. In the Google Cloud console, go to the Job list page.

    Go to Job list

  2. Click Create. The Create batch job page opens. In the left pane, the Job details page is selected.

  3. Configure the Job details page:

    1. Optional: In the Job name field, customize the job name.

      For example, enter example-disk-job.

    2. Configure the Task details section:

      1. In the New runnable window, add at least one script or container for this job to run.

        For example, to run a script that prints the contents of a file named example.txt that is located in the root directory of the persistent disk that this job uses, do the following:

        1. Select the Script checkbox. A text box appears.

        2. In the text box, enter the following script:

          echo "Here is the content of the example.txt file in the persistent disk."
          cat MOUNT_PATH/example.txt
          

          Replace MOUNT_PATH with the mount path where you plan to mount the persistent disk to this job's VMs, for example /mnt/disks/example-disk.

        3. Click Done.

      2. In the Task count field, enter the number of tasks for this job.

        For example, enter 1 (default).

      3. In the Parallelism field, enter the number of tasks to run concurrently.

        For example, enter 1 (default).

  4. Configure the Resource specifications page:

    1. In the left pane, click Resource specifications. The Resource specifications page opens.

    2. Select the location for this job. To use an existing zonal persistent disk, the job's VMs must be located in the same zone.

      1. In the Region field, select a region.

        For example, to use the example zonal persistent disk, select us-central1 (Iowa) (default).

      2. In the Zone field, select a zone.

        For example, select us-central1-a (Iowa).

  5. Configure the Additional configurations page:

    1. In the left pane, click Additional configurations. The Additional configurations page opens.

    2. For each existing zonal persistent disk that you want to mount to this job, do the following:

      1. In the Storage volumes section, click Add new volume. The New volume window appears.

      2. In the New volume window, do the following:

        1. In the Volume type section, select Persistent disk (default).

        2. In the Disk list, select an existing zonal persistent disk to mount to this job. The disk must be located in the same zone as this job.

          For example, select the existing zonal persistent disk that you prepared, which is located in the us-central1-a zone and contains the file example.txt.

        3. Optional: If you want to rename this zonal persistent disk, do the following:

          1. Select Customize device name.

          2. In the Device name field, enter the new name for your disk.

        4. In the Mount path field, enter the mount path (MOUNT_PATH) for this persistent disk:

          For example, enter the following:

          /mnt/disks/EXISTING_PERSISTENT_DISK_NAME
          

          Replace EXISTING_PERSISTENT_DISK_NAME with the name of the disk. If you renamed the zonal persistent disk, use the new name.

          For example, replace EXISTING_PERSISTENT_DISK_NAME with example-disk.

        5. Click Done.

  6. Optional: Configure the other fields for this job.

  7. Optional: To review the job configuration, in the left pane, click Preview.

  8. Click Create.

The Job details page displays the job that you created.

gcloud

The following example uses the gcloud CLI to create a job that attaches and mounts an existing persistent disk and a new persistent disk. The job has 3 tasks that each run a script to create a file in the new persistent disk named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.

To create a job that uses persistent disks using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, specify the persistent disks in the instances field and mount the persistent disks in the volumes field.

  1. Create a JSON file.

    • If you are not using an instance template for this job, create a JSON file with the following contents:

      {
          "allocationPolicy": {
              "instances": [
                  {
                      "policy": {
                          "disks": [
                              {
                                  "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                                  "existingDisk": "projects/PROJECT_ID/EXISTING_PERSISTENT_DISK_LOCATION/disks/EXISTING_PERSISTENT_DISK_NAME"
                              },
                              {
                                  "newDisk": {
                                      "sizeGb": NEW_PERSISTENT_DISK_SIZE,
                                      "type": "NEW_PERSISTENT_DISK_TYPE"
                                  },
                                  "deviceName": "NEW_PERSISTENT_DISK_NAME"
                              }
                          ]
                      }
                  }
              ],
              "location": {
                  "allowedLocations": [
                      "EXISTING_PERSISTENT_DISK_LOCATION"
                  ]
              }
          },
          "taskGroups": [
              {
                  "taskSpec": {
                      "runnables": [
                          {
                              "script": {
                                  "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                              }
                          }
                      ],
                      "volumes": [
                          {
                              "deviceName": "NEW_PERSISTENT_DISK_NAME",
                              "mountPath": "/mnt/disks/NEW_PERSISTENT_DISK_NAME",
                              "mountOptions": "rw,async"
                          },
                          {
                              "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                              "mountPath": "/mnt/disks/EXISTING_PERSISTENT_DISK_NAME"
                          }
                      ]
                  },
                  "taskCount":3
              }
          ],
          "logsPolicy": {
              "destination": "CLOUD_LOGGING"
          }
      }
      

      Replace the following:

      • PROJECT_ID: the project ID of your project.
      • EXISTING_PERSISTENT_DISK_NAME: the name of an existing persistent disk.
      • EXISTING_PERSISTENT_DISK_LOCATION: the location of an existing persistent disk. For each existing zonal persistent disk, the job's location must be the disk's zone; for each existing regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located. If you are not specifying any existing persistent disks, you can select any location. Learn more about the allowedLocations field.
      • NEW_PERSISTENT_DISK_SIZE: the size of the new persistent disk in GB. The allowed sizes depend on the type of persistent disk, but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
      • NEW_PERSISTENT_DISK_TYPE: the disk type of the new persistent disk, either pd-standard, pd-balanced, pd-ssd, or pd-extreme. The default disk type for non-boot persistent disks is pd-standard.
      • NEW_PERSISTENT_DISK_NAME: the name of the new persistent disk.
    • If you are using a VM instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:

      "instances": [
          {
              "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
          }
      ],
      

      where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses persistent disks, this instance template must define and attach the persistent disks that you want the job to use. For this example, the template must define and attach a new persistent disk named NEW_PERSISTENT_DISK_NAME and attach an existing persistent disk named EXISTING_PERSISTENT_DISK_NAME.

  2. Run the following command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.

    • LOCATION: the location of the job.

    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.
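If you generate the configuration file programmatically, a small standard-library script can substitute your values into the structure shown above before you run gcloud batch jobs submit. The helper name and the default values below are illustrative, not part of any Google API:

```python
import json

def build_pd_job_config(project_id, existing_disk, existing_disk_location,
                        new_disk_name, new_disk_size_gb=10,
                        new_disk_type="pd-balanced", task_count=3):
    """Build the job configuration shown above as a Python dict."""
    script = ("echo Hello world from task ${BATCH_TASK_INDEX}. >> "
              f"/mnt/disks/{new_disk_name}/output_task_${{BATCH_TASK_INDEX}}.txt")
    return {
        "allocationPolicy": {
            "instances": [{"policy": {"disks": [
                {"deviceName": existing_disk,
                 "existingDisk": (f"projects/{project_id}/"
                                  f"{existing_disk_location}/disks/{existing_disk}")},
                {"newDisk": {"sizeGb": new_disk_size_gb, "type": new_disk_type},
                 "deviceName": new_disk_name},
            ]}}],
            "location": {"allowedLocations": [existing_disk_location]},
        },
        "taskGroups": [{
            "taskSpec": {
                "runnables": [{"script": {"text": script}}],
                "volumes": [
                    {"deviceName": new_disk_name,
                     "mountPath": f"/mnt/disks/{new_disk_name}",
                     "mountOptions": "rw,async"},
                    {"deviceName": existing_disk,
                     "mountPath": f"/mnt/disks/{existing_disk}"},
                ],
            },
            "taskCount": task_count,
        }],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }

if __name__ == "__main__":
    # Write the file that `gcloud batch jobs submit --config job.json` expects.
    with open("job.json", "w") as f:
        json.dump(build_pd_job_config("my-project", "example-disk",
                                      "zones/us-central1-a", "new-disk"),
                  f, indent=4)
```

The field names mirror the JSON configuration above, so the resulting file can be passed directly as JSON_CONFIGURATION_FILE.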

API

The following example uses the Batch API to create a job that attaches and mounts an existing persistent disk and a new persistent disk. The job has 3 tasks that each run a script to create a file in the new persistent disk named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.

To create a job that uses persistent disks using the Batch API, use the jobs.create method. In the request, specify the persistent disks in the instances field and mount the persistent disks in the volumes field.

  • If you are not using an instance template for this job, make the following request:

    POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
    
    {
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "disks": [
                            {
                                "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                                "existingDisk": "projects/PROJECT_ID/EXISTING_PERSISTENT_DISK_LOCATION/disks/EXISTING_PERSISTENT_DISK_NAME"
                            },
                            {
                                "newDisk": {
                                    "sizeGb": NEW_PERSISTENT_DISK_SIZE,
                                    "type": "NEW_PERSISTENT_DISK_TYPE"
                                },
                                "deviceName": "NEW_PERSISTENT_DISK_NAME"
                            }
                        ]
                    }
                }
            ],
            "location": {
                "allowedLocations": [
                    "EXISTING_PERSISTENT_DISK_LOCATION"
                ]
            }
        },
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "deviceName": "NEW_PERSISTENT_DISK_NAME",
                            "mountPath": "/mnt/disks/NEW_PERSISTENT_DISK_NAME",
                            "mountOptions": "rw,async"
                        },
                        {
                            "deviceName": "EXISTING_PERSISTENT_DISK_NAME",
                            "mountPath": "/mnt/disks/EXISTING_PERSISTENT_DISK_NAME"
                        }
                    ]
                },
                "taskCount":3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • PROJECT_ID: the project ID of your project.
    • LOCATION: the location of the job.
    • JOB_NAME: the name of the job.
    • EXISTING_PERSISTENT_DISK_NAME: the name of an existing persistent disk.
    • EXISTING_PERSISTENT_DISK_LOCATION: the location of an existing persistent disk. For each existing zonal persistent disk, the job's location must be the disk's zone; for each existing regional persistent disk, the job's location must be either the disk's region or, if specifying zones, one or both of the specific zones where the regional persistent disk is located. If you are not specifying any existing persistent disks, you can select any location. Learn more about the allowedLocations field.
    • NEW_PERSISTENT_DISK_SIZE: the size of the new persistent disk in GB. The allowed sizes depend on the type of persistent disk, but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
    • NEW_PERSISTENT_DISK_TYPE: the disk type of the new persistent disk, either pd-standard, pd-balanced, pd-ssd, or pd-extreme. The default disk type for non-boot persistent disks is pd-standard.
    • NEW_PERSISTENT_DISK_NAME: the name of the new persistent disk.
  • If you are using a VM instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:

    "instances": [
        {
            "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
        }
    ],
    ...
    

    where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses persistent disks, this instance template must define and attach the persistent disks that you want the job to use. For this example, the template must define and attach a new persistent disk named NEW_PERSISTENT_DISK_NAME and attach an existing persistent disk named EXISTING_PERSISTENT_DISK_NAME.

C++

To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for C++, use the CreateJob function and include the following:

  • To attach the persistent disks to the job's VMs, include one of the following:
    • If you are not using a VM instance template for this job, use the set_remote_path method.
    • If you are using a VM instance template for this job, use the set_instance_template method.
  • To mount the persistent disks to the job, use the volumes field with the deviceName and mountPath fields. For new persistent disks, also use the mountOptions field to enable writing.

For a code sample of a similar use case, see Use a Cloud Storage bucket.

Go

To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Go, use the CreateJob function, as shown in the following code sample:

import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job with persistent disk
func createJobWithPD(w io.Writer, projectID, jobName, pdName string) error {
	// jobName := "job-name"
	// pdName := "disk-name"
	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("batchClient error: %w", err)
	}
	defer batchClient.Close()

	runn := &batchpb.Runnable{
		Executable: &batchpb.Runnable_Script_{
			Script: &batchpb.Runnable_Script{
				Command: &batchpb.Runnable_Script_Text{
					Text: "echo Hello world from script 1 for task ${BATCH_TASK_INDEX}",
				},
			},
		},
	}
	volume := &batchpb.Volume{
		MountPath: fmt.Sprintf("/mnt/disks/%v", pdName),
		Source: &batchpb.Volume_DeviceName{
			DeviceName: pdName,
		},
	}

	// The disk type of the new persistent disk, either pd-standard,
	// pd-balanced, pd-ssd, or pd-extreme. For Batch jobs, the default is pd-balanced
	disk := &batchpb.AllocationPolicy_Disk{
		Type:   "pd-balanced",
		SizeGb: 10,
	}

	taskSpec := &batchpb.TaskSpec{
		ComputeResource: &batchpb.ComputeResource{
			// CpuMilli is milliseconds per cpu-second. This means the task requires 1 CPU.
			CpuMilli:  1000,
			MemoryMib: 16,
		},
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
		Runnables:     []*batchpb.Runnable{runn},
		Volumes:       []*batchpb.Volume{volume},
	}

	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	labels := map[string]string{"env": "testing", "type": "container"}

	// Policies are used to define on what kind of virtual machines the tasks will run on.
	// Read more about local disks here: https://cloud.google.com/compute/docs/disks/persistent-disks
	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					MachineType: "n1-standard-1",
					Disks: []*batchpb.AllocationPolicy_AttachedDisk{
						{
							Attached: &batchpb.AllocationPolicy_AttachedDisk_NewDisk{
								NewDisk: disk,
							},
							DeviceName: pdName,
						},
					},
				},
			},
		}},
	}

	// We use Cloud Logging as it's an out of the box available option
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	job := &batchpb.Job{
		Name:             jobName,
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           labels,
		LogsPolicy:       logsPolicy,
	}

	request := &batchpb.CreateJobRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, "us-central1"),
		JobId:  jobName,
		Job:    job,
	}

	createdJob, err := batchClient.CreateJob(ctx, request)
	if err != nil {
		return fmt.Errorf("unable to create job: %w", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", createdJob)
	return nil
}

Java

To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Java, use the CreateJobRequest class, as shown in the following code sample:


import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.AllocationPolicy.AttachedDisk;
import com.google.cloud.batch.v1.AllocationPolicy.Disk;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicyOrTemplate;
import com.google.cloud.batch.v1.AllocationPolicy.LocationPolicy;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Script;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.cloud.batch.v1.Volume;
import com.google.common.collect.Lists;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreatePersistentDiskJob {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";
    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";
    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";
    // The size of the new persistent disk in GB.
    // The allowed sizes depend on the type of persistent disk,
    // but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
    int diskSize = 10;
    // The name of the new persistent disk.
    String newPersistentDiskName = "DISK-NAME";
    // The name of an existing persistent disk.
    String existingPersistentDiskName = "EXISTING-DISK-NAME";
    // The location of an existing persistent disk. For more info :
    // https://cloud.google.com/batch/docs/create-run-job-storage#gcloud
    String location = "regions/us-central1";
    // The disk type of the new persistent disk, either pd-standard,
    // pd-balanced, pd-ssd, or pd-extreme. For Batch jobs, the default is pd-balanced.
    String newDiskType = "pd-balanced";

    createPersistentDiskJob(projectId, region, jobName, newPersistentDiskName,
            diskSize, existingPersistentDiskName, location, newDiskType);
  }

  // Creates a job that attaches and mounts an existing persistent disk and a new persistent disk
  public static Job createPersistentDiskJob(String projectId, String region, String jobName,
                                            String newPersistentDiskName, int diskSize,
                                            String existingPersistentDiskName,
                                            String location, String newDiskType)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {
      // Define what will be done as part of the job.
      String text = "echo Hello world from task ${BATCH_TASK_INDEX}. "
              + ">> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt";
      Runnable runnable =
          Runnable.newBuilder()
              .setScript(
                  Script.newBuilder()
                      .setText(text)
                      // You can also run a script from a file. Just remember, that needs to be a
                      // script that's already on the VM that will be running the job.
                      // Using setText() and setPath() is mutually exclusive.
                      // .setPath("/tmp/test.sh")
                      .build())
              .build();

      TaskSpec task = TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addAllVolumes(volumes(newPersistentDiskName, existingPersistentDiskName))
              .addRunnables(runnable)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder()
          .setTaskCount(3)
          .setParallelism(1)
          .setTaskSpec(task)
          .build();

      // Policies are used to define the type of virtual machines the tasks will run on.
      InstancePolicy policy = InstancePolicy.newBuilder()
              .addAllDisks(attachedDisks(newPersistentDiskName, diskSize, newDiskType,
                  projectId, location, existingPersistentDiskName))
              .build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(
                  InstancePolicyOrTemplate.newBuilder()
                      .setPolicy(policy))
                  .setLocation(LocationPolicy.newBuilder().addAllowedLocations(location))
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              // We use Cloud Logging as it's an out-of-the-box option.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(LogsPolicy.Destination.CLOUD_LOGGING))
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }

  // Creates link to existing disk and creates configuration for new disk
  private static Iterable<AttachedDisk> attachedDisks(String newPersistentDiskName, int diskSize,
                                                      String newDiskType, String projectId,
                                                      String existingPersistentDiskLocation,
                                                      String existingPersistentDiskName) {
    AttachedDisk newDisk = AttachedDisk.newBuilder()
            .setDeviceName(newPersistentDiskName)
            .setNewDisk(Disk.newBuilder().setSizeGb(diskSize).setType(newDiskType))
            .build();

    String diskPath = String.format("projects/%s/%s/disks/%s", projectId,
            existingPersistentDiskLocation, existingPersistentDiskName);

    AttachedDisk existingDisk = AttachedDisk.newBuilder()
            .setDeviceName(existingPersistentDiskName)
            .setExistingDisk(diskPath)
            .build();

    return Lists.newArrayList(existingDisk, newDisk);
  }

  // Describes a volume and parameters for it to be mounted to a VM.
  private static Iterable<Volume> volumes(String newPersistentDiskName,
                                          String existingPersistentDiskName) {
    Volume newVolume = Volume.newBuilder()
            .setDeviceName(newPersistentDiskName)
            .setMountPath("/mnt/disks/" + newPersistentDiskName)
            .addMountOptions("rw")
            .addMountOptions("async")
            .build();

    Volume existingVolume = Volume.newBuilder()
            .setDeviceName(existingPersistentDiskName)
            .setMountPath("/mnt/disks/" + existingPersistentDiskName)
            .build();

    return Lists.newArrayList(newVolume, existingVolume);
  }
}

Node.js

To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Node.js, use the createJob method, as shown in the following code sample:

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

/**
 * TODO(developer): Update these variables before running the sample.
 */
// Project ID or project number of the Google Cloud project you want to use.
const projectId = await batchClient.getProjectId();
// The name of the job that will be created.
// It needs to be unique for each project and region pair.
const jobName = 'batch-create-persistent-disk-job';
// Name of the region you want to use to run the job. Regions that are
// available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
const region = 'europe-central2';
// The name of an existing persistent disk.
const existingPersistentDiskName = 'existing-persistent-disk-name';
// The name of the new persistent disk.
const newPersistentDiskName = 'new-persistent-disk-name';
// The size of the new persistent disk in GB.
// The allowed sizes depend on the type of persistent disk,
// but the minimum is often 10 GB (10) and the maximum is often 64 TB (64000).
const diskSize = 10;
// The location of an existing persistent disk. For more info :
// https://cloud.google.com/batch/docs/create-run-job-storage#gcloud
const location = 'regions/us-central1';
// The disk type of the new persistent disk, either pd-standard,
// pd-balanced, pd-ssd, or pd-extreme. For Batch jobs, the default is pd-balanced.
const newDiskType = 'pd-balanced';

// Define what will be done as part of the job.
const runnable = new batch.Runnable({
  script: new batch.Runnable.Script({
    commands: [
      '-c',
      'echo Hello world! This is task ${BATCH_TASK_INDEX}.' +
        '>> /mnt/disks/NEW_PERSISTENT_DISK_NAME/output_task_${BATCH_TASK_INDEX}.txt',
    ],
  }),
});

// Define volumes and their parameters to be mounted to a VM.
const newVolume = new batch.Volume({
  deviceName: newPersistentDiskName,
  mountPath: `/mnt/disks/${newPersistentDiskName}`,
  mountOptions: ['rw', 'async'],
});

const existingVolume = new batch.Volume({
  deviceName: existingPersistentDiskName,
  mountPath: `/mnt/disks/${existingPersistentDiskName}`,
});

const task = new batch.TaskSpec({
  runnables: [runnable],
  volumes: [newVolume, existingVolume],
  maxRetryCount: 2,
  maxRunDuration: {seconds: 3600},
});

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup({
  taskCount: 3,
  taskSpec: task,
});

const newDisk = new batch.AllocationPolicy.Disk({
  type: newDiskType,
  sizeGb: diskSize,
});

// Policies are used to define on what kind of virtual machines the tasks will run on.
// Read more about local disks here: https://cloud.google.com/compute/docs/disks/persistent-disks
const instancePolicy = new batch.AllocationPolicy.InstancePolicy({
  disks: [
    // Create configuration for new disk
    new batch.AllocationPolicy.AttachedDisk({
      deviceName: newPersistentDiskName,
      newDisk,
    }),
    // Create link to existing disk
    new batch.AllocationPolicy.AttachedDisk({
      existingDisk: `projects/${projectId}/${location}/disks/${existingPersistentDiskName}`,
      deviceName: existingPersistentDiskName,
    }),
  ],
});

const locationPolicy = new batch.AllocationPolicy.LocationPolicy({
  allowedLocations: [location],
});

const allocationPolicy = new batch.AllocationPolicy.InstancePolicyOrTemplate({
  instances: [{policy: instancePolicy}],
  location: locationPolicy,
});

const job = new batch.Job({
  name: jobName,
  taskGroups: [group],
  labels: {env: 'testing', type: 'script'},
  allocationPolicy,
  // We use Cloud Logging as it's an option available out of the box
  logsPolicy: new batch.LogsPolicy({
    destination: batch.LogsPolicy.Destination.CLOUD_LOGGING,
  }),
});
// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateBatchPersistentDiskJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const [response] = await batchClient.createJob(request);
  console.log(JSON.stringify(response));
}

await callCreateBatchPersistentDiskJob();

Python

To create a Batch job that uses new or existing persistent disks using the Cloud Client Libraries for Python, use the CreateJob function and include the following:

  • To attach the persistent disks to the job's VMs, include one of the following:
    • If you are not using a VM instance template for this job, use the AllocationPolicy.AttachedDisk attribute.
    • If you are using a VM instance template for this job, use the instance_template attribute.
  • To mount the persistent disks to the job, use the Volume class with the device_name attribute and mount_path attribute. For new persistent disks, also use the mount_options attribute to enable writing.

For example, use the following code sample:

from google.cloud import batch_v1


def create_with_pd_job(
    project_id: str,
    region: str,
    job_name: str,
    disk_name: str,
    zone: str,
    existing_disk_name=None,
) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command on Cloud Compute instances with mounted persistent disk.

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.
        disk_name: name of the new persistent disk to be created and mounted for your Job.
        zone: the zone of the existing disk, which is also used as the allowed location for the job's VMs.
        existing_disk_name (optional): name of an existing persistent disk that you want to attach to the job.

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    task = batch_v1.TaskSpec()
    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = (
        "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/"
        + disk_name
        + "/output_task_${BATCH_TASK_INDEX}.txt"
    )
    task.runnables = [runnable]
    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    volume = batch_v1.Volume()
    volume.device_name = disk_name
    volume.mount_path = f"/mnt/disks/{disk_name}"
    task.volumes = [volume]

    if existing_disk_name:
        volume2 = batch_v1.Volume()
        volume2.device_name = existing_disk_name
        volume2.mount_path = f"/mnt/disks/{existing_disk_name}"
        task.volumes.append(volume2)

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    disk = batch_v1.AllocationPolicy.Disk()
    # The disk type of the new persistent disk, either pd-standard,
    # pd-balanced, pd-ssd, or pd-extreme. For Batch jobs, the default is pd-balanced
    disk.type_ = "pd-balanced"
    disk.size_gb = 10

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # Read more about local disks here: https://cloud.google.com/compute/docs/disks/persistent-disks
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "n1-standard-1"

    attached_disk = batch_v1.AllocationPolicy.AttachedDisk()
    attached_disk.new_disk = disk
    attached_disk.device_name = disk_name
    policy.disks = [attached_disk]

    if existing_disk_name:
        attached_disk2 = batch_v1.AllocationPolicy.AttachedDisk()
        attached_disk2.existing_disk = (
            f"projects/{project_id}/zones/{zone}/disks/{existing_disk_name}"
        )
        attached_disk2.device_name = existing_disk_name
        policy.disks.append(attached_disk2)

    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy

    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    location = batch_v1.AllocationPolicy.LocationPolicy()
    location.allowed_locations = [f"zones/{zone}"]
    allocation_policy.location = location

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "script"}

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

使用本地 SSD

使用本地 SSD 的作业存在以下限制:

您可以使用 gcloud CLI、Batch API、Java 或 Python 创建使用本地 SSD 的作业。以下示例介绍了如何创建用于创建、附加和挂载本地 SSD 的作业。该作业还有 3 个任务,每个任务都会运行一个脚本,以便在名为 output_task_TASK_INDEX.txt 的本地 SSD 中创建文件,其中 TASK_INDEX 是每个任务的编号:012

gcloud

To create a job that uses a local SSD using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, create and attach the local SSD in the instances field and mount the local SSD in the volumes field.

  1. Create a JSON file.

    • If you aren't using an instance template for this job, create a JSON file with the following contents:

      {
          "allocationPolicy": {
              "instances": [
                  {
                      "policy": {
                          "machineType": MACHINE_TYPE,
                          "disks": [
                              {
                                  "newDisk": {
                                      "sizeGb": LOCAL_SSD_SIZE,
                                      "type": "local-ssd"
                                  },
                                  "deviceName": "LOCAL_SSD_NAME"
                              }
                          ]
                      }
                  }
              ]
          },
          "taskGroups": [
              {
                  "taskSpec": {
                      "runnables": [
                          {
                              "script": {
                                  "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/LOCAL_SSD_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                              }
                          }
                      ],
                      "volumes": [
                          {
                              "deviceName": "LOCAL_SSD_NAME",
                              "mountPath": "/mnt/disks/LOCAL_SSD_NAME",
                              "mountOptions": "rw,async"
                          }
                      ]
                  },
                  "taskCount":3
              }
          ],
          "logsPolicy": {
              "destination": "CLOUD_LOGGING"
          }
      }
      

      Replace the following:

      • MACHINE_TYPE: the machine type, which can be predefined or custom, of the job's VMs. The allowed number of local SSDs depends on the machine type of your job's VMs.
      • LOCAL_SSD_NAME: the name of the local SSD created for this job.
      • LOCAL_SSD_SIZE: the size of all the local SSDs in GB. Each local SSD is 375 GB, so this value must be a multiple of 375 GB. For example, for 2 local SSDs, set this value to 750 GB.
    • If you are using a VM instance template for this job, create a JSON file as shown previously, except replace the instances field with the following:

      "instances": [
          {
              "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
          }
      ],
      

      where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses local SSDs, this instance template must define and attach the local SSDs that you want the job to use. For this example, the template must define and attach a local SSD named LOCAL_SSD_NAME.

  2. Run the following command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.
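Because each local SSD partition is a fixed 375 GB, the sizeGb value must be a multiple of 375. A small sketch (hypothetical helper, plain Python) that derives the value from the number of local SSDs you want to attach:

```python
LOCAL_SSD_GB = 375  # fixed size of a single local SSD partition

def local_ssd_size_gb(ssd_count: int) -> int:
    """Return the sizeGb value needed to attach ssd_count local SSDs."""
    if ssd_count < 1:
        raise ValueError("ssd_count must be at least 1")
    return ssd_count * LOCAL_SSD_GB
```

For example, for 2 local SSDs this returns 750, matching the guidance above. Note that the allowed number of local SSDs also depends on the machine type, which this sketch does not check.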

API

To create a job that uses a local SSD using the Batch API, use the jobs.create method. In the request, create and attach the local SSD in the instances field and mount the local SSD in the volumes field.

  • If you aren't using an instance template for this job, make the following request:

    POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME
    
    {
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "machineType": MACHINE_TYPE,
                        "disks": [
                            {
                                "newDisk": {
                                    "sizeGb": LOCAL_SSD_SIZE,
                                    "type": "local-ssd"
                                },
                                "deviceName": "LOCAL_SSD_NAME"
                            }
                        ]
                    }
                }
            ]
        },
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/disks/LOCAL_SSD_NAME/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "deviceName": "LOCAL_SSD_NAME",
                            "mountPath": "/mnt/disks/LOCAL_SSD_NAME",
                            "mountOptions": "rw,async"
                        }
                    ]
                },
                "taskCount":3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • PROJECT_ID: the project ID of your project.
    • LOCATION: the location of the job.
    • JOB_NAME: the name of the job.
    • MACHINE_TYPE: the machine type, which can be predefined or custom, of the job's VMs. The allowed number of local SSDs depends on the machine type of your job's VMs.
    • LOCAL_SSD_NAME: the name of the local SSD created for this job.
    • LOCAL_SSD_SIZE: the size of all the local SSDs in GB. Each local SSD is 375 GB, so this value must be a multiple of 375 GB. For example, for 2 local SSDs, set this value to 750 GB.
  • If you are using a VM instance template for this job, make a request as shown previously, except replace the instances field with the following:

    "instances": [
        {
            "instanceTemplate": "INSTANCE_TEMPLATE_NAME"
        }
    ],
    ...
    

    where INSTANCE_TEMPLATE_NAME is the name of the instance template for this job. For a job that uses local SSDs, this instance template must define and attach the local SSDs that you want the job to use. For this example, the template must define and attach a local SSD named LOCAL_SSD_NAME.
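In either configuration, each volume's deviceName must correspond to a disk attached in the allocation policy (or in the instance template). An illustrative consistency check over the job's JSON dict — the helper name and the assumption that disks are defined inline (not via a template) are mine, not part of the Batch API:

```python
def check_volume_devices(job: dict) -> list[str]:
    """Return deviceNames referenced by volumes but not attached as disks."""
    # Collect deviceNames of all disks attached via inline instance policies.
    attached = {
        d.get("deviceName")
        for inst in job.get("allocationPolicy", {}).get("instances", [])
        for d in inst.get("policy", {}).get("disks", [])
    }
    # Report any volume whose deviceName has no matching attached disk.
    missing = []
    for group in job.get("taskGroups", []):
        for vol in group.get("taskSpec", {}).get("volumes", []):
            name = vol.get("deviceName")
            if name and name not in attached:
                missing.append(name)
    return missing
```

An empty result means every mounted device is defined; a non-empty result lists the deviceNames that would fail to mount.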

Go

import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job with a local SSD.
// Note: Local SSDs don't guarantee data persistence.
// More details here: https://cloud.google.com/compute/docs/disks/local-ssd#data_persistence
func createJobWithSSD(w io.Writer, projectID, jobName, ssdName string) error {
	// jobName := job-name
	// ssdName := disk-name
	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("batchClient error: %w", err)
	}
	defer batchClient.Close()

	runn := &batchpb.Runnable{
		Executable: &batchpb.Runnable_Script_{
			Script: &batchpb.Runnable_Script{
				Command: &batchpb.Runnable_Script_Text{
					Text: "echo Hello world from script 1 for task ${BATCH_TASK_INDEX}",
				},
			},
		},
	}
	volume := &batchpb.Volume{
		MountPath: fmt.Sprintf("/mnt/disks/%v", ssdName),
		Source: &batchpb.Volume_DeviceName{
			DeviceName: ssdName,
		},
	}

	// The size of all the local SSDs in GB. Each local SSD is 375 GB,
	// so this value must be a multiple of 375 GB.
	// For example, for 2 local SSDs, set this value to 750 GB.
	disk := &batchpb.AllocationPolicy_Disk{
		Type:   "local-ssd",
		SizeGb: 375,
	}

	taskSpec := &batchpb.TaskSpec{
		ComputeResource: &batchpb.ComputeResource{
			// CpuMilli is milliseconds per cpu-second. This means the task requires 1 CPU.
			CpuMilli:  1000,
			MemoryMib: 16,
		},
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
		Runnables:     []*batchpb.Runnable{runn},
		Volumes:       []*batchpb.Volume{volume},
	}

	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	labels := map[string]string{"env": "testing", "type": "container"}

	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					// The allowed number of local SSDs depends on the machine type for your job's VMs.
					// In this case, we tell the system to use the "n1-standard-1" machine type, which requires attaching the local SSD manually.
					// Read more about local disks here: https://cloud.google.com/compute/docs/disks/local-ssd#lssd_disk_options
					MachineType: "n1-standard-1",
					Disks: []*batchpb.AllocationPolicy_AttachedDisk{
						{
							Attached: &batchpb.AllocationPolicy_AttachedDisk_NewDisk{
								NewDisk: disk,
							},
							DeviceName: ssdName,
						},
					},
				},
			},
		}},
	}

	// We use Cloud Logging as it's an out of the box available option
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	job := &batchpb.Job{
		Name:             jobName,
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           labels,
		LogsPolicy:       logsPolicy,
	}

	request := &batchpb.CreateJobRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, "us-central1"),
		JobId:  jobName,
		Job:    job,
	}

	created_job, err := batchClient.CreateJob(ctx, request)
	if err != nil {
		return fmt.Errorf("unable to create job: %w", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", created_job)
	return nil
}

Java


import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.AllocationPolicy.AttachedDisk;
import com.google.cloud.batch.v1.AllocationPolicy.Disk;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicyOrTemplate;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Script;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.cloud.batch.v1.Volume;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateLocalSsdJob {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Google Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";
    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";
    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";
    // The name of a local SSD created for this job.
    String localSsdName = "SSD-NAME";
    // The machine type, which can be predefined or custom, of the job's VMs.
    // The allowed number of local SSDs depends on the machine type
    // for your job's VMs are listed on: https://cloud.google.com/compute/docs/disks#localssds
    String machineType = "c3d-standard-8-lssd";
    // The size of all the local SSDs in GB. Each local SSD is 375 GB,
    // so this value must be a multiple of 375 GB.
    // For example, for 2 local SSDs, set this value to 750 GB.
    int ssdSize = 375;

    createLocalSsdJob(projectId, region, jobName, localSsdName, ssdSize, machineType);
  }

  // Create a job that uses local SSDs
  public static Job createLocalSsdJob(String projectId, String region, String jobName,
                                      String localSsdName, int ssdSize, String machineType)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {
      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setScript(
                  Script.newBuilder()
                      .setText(
                          "echo Hello world! This is task ${BATCH_TASK_INDEX}. "
                                  + "This job has a total of ${BATCH_TASK_COUNT} tasks.")
                      // You can also run a script from a file. Just remember, that needs to be a
                      // script that's already on the VM that will be running the job.
                      // Using setText() and setPath() is mutually exclusive.
                      // .setPath("/tmp/test.sh")
                      .build())
              .build();

      Volume volume = Volume.newBuilder()
          .setDeviceName(localSsdName)
          .setMountPath("/mnt/disks/" + localSsdName)
          .addMountOptions("rw")
          .addMountOptions("async")
          .build();

      TaskSpec task = TaskSpec.newBuilder()
          // Jobs can be divided into tasks. In this case, we have only one task.
          .addVolumes(volume)
          .addRunnables(runnable)
          .setMaxRetryCount(2)
          .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
          .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder()
          .setTaskCount(3)
          .setParallelism(1)
          .setTaskSpec(task)
          .build();

      // Policies are used to define on what kind of virtual machines the tasks will run on.
      InstancePolicy policy = InstancePolicy.newBuilder()
          .setMachineType(machineType)
          .addDisks(AttachedDisk.newBuilder()
              .setDeviceName(localSsdName)
              // For example, local SSD uses type "local-ssd".
              // Persistent disks and boot disks use "pd-balanced", "pd-extreme", "pd-ssd"
              // or "pd-standard".
              .setNewDisk(Disk.newBuilder().setSizeGb(ssdSize).setType("local-ssd")))
          .build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(
                  InstancePolicyOrTemplate.newBuilder()
                      .setPolicy(policy)
                      .build())
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              // We use Cloud Logging as it's an out of the box available option.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(LogsPolicy.Destination.CLOUD_LOGGING))
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }
}

Node.js

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

/**
 * TODO(developer): Update these variables before running the sample.
 */
// Project ID or project number of the Google Cloud project you want to use.
const projectId = await batchClient.getProjectId();
// Name of the region you want to use to run the job. Regions that are
// available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
const region = 'europe-central2';
// The name of the job that will be created.
// It needs to be unique for each project and region pair.
const jobName = 'batch-local-ssd-job';
// The name of a local SSD created for this job.
const localSsdName = 'ssd-name';
// The machine type, which can be predefined or custom, of the job's VMs.
// The allowed number of local SSDs depends on the machine type
// for your job's VMs are listed on: https://cloud.google.com/compute/docs/disks#localssds
const machineType = 'c3d-standard-8-lssd';
// The size of all the local SSDs in GB. Each local SSD is 375 GB,
// so this value must be a multiple of 375 GB.
// For example, for 2 local SSDs, set this value to 750 GB.
const ssdSize = 375;

// Define what will be done as part of the job.
const runnable = new batch.Runnable({
  script: new batch.Runnable.Script({
    // Runnable.Script accepts `text` or `path`; the `commands` field
    // belongs to Runnable.Container, not Script.
    text: 'echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks.',
  }),
});

const task = new batch.TaskSpec({
  runnables: [runnable],
  maxRetryCount: 2,
  maxRunDuration: {seconds: 3600},
});

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup({
  taskCount: 3,
  taskSpec: task,
});

// Policies are used to define on what kind of virtual machines the tasks will run on.
const instancePolicy = new batch.AllocationPolicy.InstancePolicy({
  machineType,
  disks: [
    new batch.AllocationPolicy.AttachedDisk({
      deviceName: localSsdName,
      // For example, local SSD uses type "local-ssd".
      // Persistent disks and boot disks use "pd-balanced", "pd-extreme", "pd-ssd"
      // or "pd-standard".
      // The new disk itself is an AllocationPolicy.Disk, not an AttachedDisk.
      newDisk: new batch.AllocationPolicy.Disk({
        type: 'local-ssd',
        sizeGb: ssdSize,
      }),
    }),
  ],
});

const allocationPolicy = new batch.AllocationPolicy({
  instances: [{policy: instancePolicy}],
});

const job = new batch.Job({
  name: jobName,
  taskGroups: [group],
  labels: {env: 'testing', type: 'script'},
  allocationPolicy,
  // We use Cloud Logging as it's an option available out of the box
  logsPolicy: new batch.LogsPolicy({
    destination: batch.LogsPolicy.Destination.CLOUD_LOGGING,
  }),
});
// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateBatchLocalSsdJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const [response] = await batchClient.createJob(request);
  console.log(JSON.stringify(response));
}

await callCreateBatchLocalSsdJob();

Python

from google.cloud import batch_v1


def create_local_ssd_job(
    project_id: str, region: str, job_name: str, ssd_name: str
) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command on Cloud Compute instances with mounted local SSD.
    Note: local SSD does not guarantee Local SSD data persistence.
    More details here: https://cloud.google.com/compute/docs/disks/local-ssd#data_persistence

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.
        ssd_name: name of the local ssd to be mounted for your Job.

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    task = batch_v1.TaskSpec()
    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = "echo Hello world! This is task ${BATCH_TASK_INDEX}. This job has a total of ${BATCH_TASK_COUNT} tasks."
    task.runnables = [runnable]
    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    volume = batch_v1.Volume()
    volume.device_name = ssd_name
    volume.mount_path = f"/mnt/disks/{ssd_name}"
    task.volumes = [volume]

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    disk = batch_v1.AllocationPolicy.Disk()
    disk.type_ = "local-ssd"
    # The size of all the local SSDs in GB. Each local SSD is 375 GB,
    # so this value must be a multiple of 375 GB.
    # For example, for 2 local SSDs, set this value to 750 GB.
    disk.size_gb = 375
    assert disk.size_gb % 375 == 0

    # Policies are used to define on what kind of virtual machines the tasks will run on.
    # The allowed number of local SSDs depends on the machine type for your job's VMs.
    # In this case, we tell the system to use the "n1-standard-1" machine type, which requires attaching the local SSD manually.
    # Read more about local disks here: https://cloud.google.com/compute/docs/disks/local-ssd#lssd_disk_options
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "n1-standard-1"

    attached_disk = batch_v1.AllocationPolicy.AttachedDisk()
    attached_disk.new_disk = disk
    attached_disk.device_name = ssd_name
    policy.disks = [attached_disk]

    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy

    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "script"}
    # We use Cloud Logging as it's an out of the box available option
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

Use a Cloud Storage bucket

To create a job that uses an existing Cloud Storage bucket, select one of the following methods:

  • Recommended: Mount a bucket directly to your job's VMs by specifying the bucket in the job's definition, as shown in this section. When the job runs, the bucket is automatically mounted to your job's VMs using Cloud Storage FUSE.
  • Create a job with tasks that directly access a Cloud Storage bucket by using the gcloud CLI or client libraries for the Cloud Storage API. To learn how to access a Cloud Storage bucket directly from a VM, see the Compute Engine documentation for reading and writing data to Cloud Storage buckets.

Before you create a job that uses a bucket, create a bucket or identify an existing bucket. For more information, see Create buckets and List buckets.

You can create a job that uses a Cloud Storage bucket using the Google Cloud console, gcloud CLI, Batch API, C++, Go, Java, Node.js, or Python.

The following example describes how to create a job that mounts a Cloud Storage bucket. The job also has 3 tasks that each run a script to create a file in the bucket named output_task_TASK_INDEX.txt, where TASK_INDEX is the index of each task: 0, 1, and 2.

Console

To create a job that uses a Cloud Storage bucket using the Google Cloud console, do the following:

  1. In the Google Cloud console, go to the Job list page.

    Go to Job list

  2. Click Create. The Create batch job page opens. In the left pane, the Job details page is selected.

  3. Configure the Job details page:

    1. Optional: In the Job name field, enter a name for the job.

      For example, enter example-bucket-job.

    2. Configure the Task details section:

      1. In the New runnable window, add at least one script or container for this job to run.

        For example, do the following:

        1. Select the Script checkbox. A text box appears.

        2. In the text box, enter the following script:

          echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt
          

          Replace MOUNT_PATH with the mount path that this job's runnables use to access an existing Cloud Storage bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.

        3. Click Done.

      2. In the Task count field, enter the number of tasks for this job.

        For example, enter 3.

      3. In the Parallelism field, enter the number of tasks to run concurrently.

        For example, enter 1 (default).

  4. Configure the Additional configurations page:

    1. In the left pane, click Additional configurations. The Additional configurations page opens.

    2. For each Cloud Storage bucket that you want to mount to this job, do the following:

      1. In the Storage volumes section, click Add new volume. The New volume window appears.

      2. In the New volume window, do the following:

        1. In the Volume type section, select Cloud Storage bucket.

        2. In the Storage Bucket name field, enter the name of an existing bucket.

          For example, enter the bucket that you specified in this job's runnable.

        3. In the Mount path field, enter the mount path of the bucket (MOUNT_PATH), which you specified in the runnable.

        4. Click Done.

  5. Optional: Configure the other fields for this job.

  6. Optional: To review the job configuration, in the left pane, click Preview.

  7. Click Create.

The Job details page displays the job that you created.

gcloud

To create a job that uses a Cloud Storage bucket using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, mount the bucket in the volumes field.

For example, to create a job that outputs files to a Cloud Storage bucket, do the following:

  1. Create a JSON file with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "gcs": {
                                "remotePath": "BUCKET_PATH"
                            },
                            "mountPath": "MOUNT_PATH"
                        }
                    ]
                },
                "taskCount": 3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • BUCKET_PATH: the path of the bucket directory that you want this job to access, which must start with the name of the bucket. For example, for a bucket named BUCKET_NAME, the path BUCKET_NAME represents the root directory of the bucket and the path BUCKET_NAME/subdirectory represents the subdirectory subdirectory.
    • MOUNT_PATH: the mount path that the job's runnables use to access this bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.
  2. Run the following command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path for a JSON file with the job's configuration details.
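As described for BUCKET_PATH, the first path segment is the bucket name and the remainder is an optional subdirectory. A quick sketch of that split (the helper name is hypothetical, not part of any Google Cloud library):

```python
def split_bucket_path(bucket_path: str) -> tuple[str, str]:
    """Split a remotePath value into (bucket_name, subdirectory).
    The subdirectory is empty when the path is just the bucket name."""
    bucket, _, subdir = bucket_path.partition("/")
    return bucket, subdir
```

For example, "BUCKET_NAME/subdirectory" splits into the bucket "BUCKET_NAME" and the subdirectory "subdirectory", while a bare bucket name maps to the bucket's root directory.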

API

To create a job that uses a Cloud Storage bucket using the Batch API, use the jobs.create method and mount the bucket in the volumes field.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                        }
                    }
                ],
                "volumes": [
                    {
                        "gcs": {
                            "remotePath": "BUCKET_PATH"
                        },
                        "mountPath": "MOUNT_PATH"
                    }
                ]
            },
            "taskCount": 3
        }
    ],
    "logsPolicy": {
            "destination": "CLOUD_LOGGING"
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • BUCKET_PATH: the path of the bucket directory that you want this job to access, which must start with the name of the bucket. For example, for a bucket named BUCKET_NAME, the path BUCKET_NAME represents the root directory of the bucket and the path BUCKET_NAME/subdirectory represents the subdirectory subdirectory.
  • MOUNT_PATH: the mount path that the job's runnables use to access this bucket. The path must start with /mnt/disks/ followed by a directory or path that you choose. For example, if you want to represent this bucket with a directory named my-bucket, set the mount path to /mnt/disks/my-bucket.
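MOUNT_PATH must begin with /mnt/disks/ followed by at least one more path component. A minimal validation sketch (the helper name is illustrative, and this checks only the documented prefix rule, not every constraint Batch may enforce):

```python
def is_valid_mount_path(mount_path: str) -> bool:
    """Check that a bucket mountPath starts with /mnt/disks/
    and names at least one directory under it."""
    prefix = "/mnt/disks/"
    return mount_path.startswith(prefix) and len(mount_path) > len(prefix)
```

For example, /mnt/disks/my-bucket passes, while /tmp/data or a bare /mnt/disks/ does not.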

C++

For more information, see the Batch C++ API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

#include "google/cloud/batch/v1/batch_client.h"
#include <google/protobuf/text_format.h>
#include <iostream>
#include <stdexcept>

  [](std::string const& project_id, std::string const& location_id,
     std::string const& job_id, std::string const& bucket_name) {
    // Initialize the request; start with the fields that depend on the sample
    // input.
    google::cloud::batch::v1::CreateJobRequest request;
    request.set_parent("projects/" + project_id + "/locations/" + location_id);
    request.set_job_id(job_id);
    // Most of the job description is fixed in this example; use a string to
    // initialize it, and then override the GCS remote path.
    auto constexpr kText = R"pb(
      task_groups {
        task_count: 4
        task_spec {
          compute_resource { cpu_milli: 500 memory_mib: 16 }
          max_retry_count: 2
          max_run_duration { seconds: 3600 }
          runnables {
            script {
              text: "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/share/output_task_${BATCH_TASK_INDEX}.txt"
            }
          }
          volumes { mount_path: "/mnt/share" }
        }
      }
      allocation_policy {
        instances {
          policy { machine_type: "e2-standard-4" provisioning_model: STANDARD }
        }
      }
      labels { key: "env" value: "testing" }
      labels { key: "type" value: "script" }
      logs_policy { destination: CLOUD_LOGGING }
    )pb";
    auto* job = request.mutable_job();
    if (!google::protobuf::TextFormat::ParseFromString(kText, job)) {
      throw std::runtime_error("Error parsing Job description");
    }
    job->mutable_task_groups(0)
        ->mutable_task_spec()
        ->mutable_volumes(0)
        ->mutable_gcs()
        ->set_remote_path(bucket_name);
    // Create a client and issue the request.
    auto client = google::cloud::batch_v1::BatchServiceClient(
        google::cloud::batch_v1::MakeBatchServiceConnection());
    auto response = client.CreateJob(request);
    if (!response) throw std::move(response).status();
    std::cout << "Job : " << response->DebugString() << "\n";
  }

Go

For more information, see the Batch Go API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	batch "cloud.google.com/go/batch/apiv1"
	"cloud.google.com/go/batch/apiv1/batchpb"
	durationpb "google.golang.org/protobuf/types/known/durationpb"
)

// Creates and runs a job that executes the specified script
func createScriptJobWithBucket(w io.Writer, projectID, region, jobName, bucketName string) error {
	// projectID := "your_project_id"
	// region := "us-central1"
	// jobName := "some-job"
	// bucketName := "some-bucket"

	ctx := context.Background()
	batchClient, err := batch.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer batchClient.Close()

	// Define what will be done as part of the job.
	command := &batchpb.Runnable_Script_Text{
		Text: "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/share/output_task_${BATCH_TASK_INDEX}.txt",
	}

	// Specify the Google Cloud Storage bucket to mount
	volume := &batchpb.Volume{
		Source: &batchpb.Volume_Gcs{
			Gcs: &batchpb.GCS{
				RemotePath: bucketName,
			},
		},
		MountPath:    "/mnt/share",
		MountOptions: []string{},
	}

	// We can specify what resources are requested by each task.
	resources := &batchpb.ComputeResource{
		// CpuMilli is milliseconds per cpu-second. This means the task requires 50% of a single CPUs.
		CpuMilli:  500,
		MemoryMib: 16,
	}

	taskSpec := &batchpb.TaskSpec{
		Runnables: []*batchpb.Runnable{{
			Executable: &batchpb.Runnable_Script_{
				Script: &batchpb.Runnable_Script{Command: command},
			},
		}},
		ComputeResource: resources,
		MaxRunDuration: &durationpb.Duration{
			Seconds: 3600,
		},
		MaxRetryCount: 2,
		Volumes:       []*batchpb.Volume{volume},
	}

	// Tasks are grouped inside a job using TaskGroups.
	taskGroups := []*batchpb.TaskGroup{
		{
			TaskCount: 4,
			TaskSpec:  taskSpec,
		},
	}

	// Policies are used to define on what kind of virtual machines the tasks will run on.
	// In this case, we tell the system to use "e2-standard-4" machine type.
	// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
	allocationPolicy := &batchpb.AllocationPolicy{
		Instances: []*batchpb.AllocationPolicy_InstancePolicyOrTemplate{{
			PolicyTemplate: &batchpb.AllocationPolicy_InstancePolicyOrTemplate_Policy{
				Policy: &batchpb.AllocationPolicy_InstancePolicy{
					MachineType: "e2-standard-4",
				},
			},
		}},
	}

	// We use Cloud Logging as it's an out of the box available option
	logsPolicy := &batchpb.LogsPolicy{
		Destination: batchpb.LogsPolicy_CLOUD_LOGGING,
	}

	jobLabels := map[string]string{"env": "testing", "type": "script"}

	// The job's parent is the region in which the job will run
	parent := fmt.Sprintf("projects/%s/locations/%s", projectID, region)

	job := batchpb.Job{
		TaskGroups:       taskGroups,
		AllocationPolicy: allocationPolicy,
		Labels:           jobLabels,
		LogsPolicy:       logsPolicy,
	}

	req := &batchpb.CreateJobRequest{
		Parent: parent,
		JobId:  jobName,
		Job:    &job,
	}

	created_job, err := batchClient.CreateJob(ctx, req)
	if err != nil {
		return fmt.Errorf("unable to create job: %w", err)
	}

	fmt.Fprintf(w, "Job created: %v\n", created_job)

	return nil
}

Java

For more information, see the Batch Java API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicy;
import com.google.cloud.batch.v1.AllocationPolicy.InstancePolicyOrTemplate;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.GCS;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.LogsPolicy.Destination;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.Runnable.Script;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.cloud.batch.v1.Volume;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateWithMountedBucket {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";

    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";

    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";

    // Name of the bucket to be mounted for your Job.
    String bucketName = "BUCKET_NAME";

    createScriptJobWithBucket(projectId, region, jobName, bucketName);
  }

  // This method shows how to create a sample Batch Job that will run
  // a simple command on Cloud Compute instances.
  public static void createScriptJobWithBucket(String projectId, String region, String jobName,
      String bucketName)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the `batchServiceClient.close()` method on the client to safely
    // clean up any remaining background resources.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setScript(
                  Script.newBuilder()
                      .setText(
                          "echo Hello world from task ${BATCH_TASK_INDEX}. >> "
                              + "/mnt/share/output_task_${BATCH_TASK_INDEX}.txt")
                      // You can also run a script from a file. Just remember that it needs to be
                      // a script that's already on the VM that will be running the job.
                      // Using setText() and setPath() is mutually exclusive.
                      // .setPath("/tmp/test.sh")
                      .build())
              .build();

      Volume volume = Volume.newBuilder()
          .setGcs(GCS.newBuilder()
              .setRemotePath(bucketName)
              .build())
          .setMountPath("/mnt/share")
          .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 50% of a single CPU.
              .setCpuMilli(500)
              // In MiB.
              .setMemoryMib(16)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addRunnables(runnable)
              .addVolumes(volume)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(4).setTaskSpec(task).build();

      // Policies are used to define what kind of virtual machines the tasks will run on.
      // In this case, we tell the system to use "e2-standard-4" machine type.
      // Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
      InstancePolicy instancePolicy =
          InstancePolicy.newBuilder().setMachineType("e2-standard-4").build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(InstancePolicyOrTemplate.newBuilder().setPolicy(instancePolicy).build())
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              .putLabels("mount", "bucket")
              // We use Cloud Logging as it's an option available out of the box.
              .setLogsPolicy(
                  LogsPolicy.newBuilder().setDestination(Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());
    }
  }
}

Node.js

For more information, see the Batch Node.js API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment and replace these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
/**
 * The region you want the job to run in. The regions that support Batch are listed here:
 * https://cloud.google.com/batch/docs/get-started#locations
 */
// const region = 'us-central1';
/**
 * The name of the job that will be created.
 * It needs to be unique for each project and region pair.
 */
// const jobName = 'YOUR_JOB_NAME';
/**
 * The name of the bucket to be mounted.
 */
// const bucketName = 'YOUR_BUCKET_NAME';

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

// Define what will be done as part of the job.
const task = new batch.TaskSpec();
const runnable = new batch.Runnable();
runnable.script = new batch.Runnable.Script();
runnable.script.text =
  'echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/share/output_task_${BATCH_TASK_INDEX}.txt';
// You can also run a script from a file. Just remember that it needs to be a script
// that's already on the VM that will be running the job. Using runnable.script.text
// and runnable.script.path is mutually exclusive.
// runnable.script.path = '/tmp/test.sh'
task.runnables = [runnable];

const gcsBucket = new batch.GCS();
gcsBucket.remotePath = bucketName;
const gcsVolume = new batch.Volume();
gcsVolume.gcs = gcsBucket;
gcsVolume.mountPath = '/mnt/share';
task.volumes = [gcsVolume];

// We can specify what resources are requested by each task.
const resources = new batch.ComputeResource();
resources.cpuMilli = 2000; // in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
resources.memoryMib = 16;
task.computeResource = resources;

task.maxRetryCount = 2;
task.maxRunDuration = {seconds: 3600};

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup();
group.taskCount = 4;
group.taskSpec = task;

// Policies are used to define what kind of virtual machines the tasks will run on.
// In this case, we tell the system to use "e2-standard-4" machine type.
// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
const allocationPolicy = new batch.AllocationPolicy();
const policy = new batch.AllocationPolicy.InstancePolicy();
policy.machineType = 'e2-standard-4';
const instances = new batch.AllocationPolicy.InstancePolicyOrTemplate();
instances.policy = policy;
allocationPolicy.instances = [instances];

const job = new batch.Job();
job.name = jobName;
job.taskGroups = [group];
job.allocationPolicy = allocationPolicy;
job.labels = {env: 'testing', type: 'script'};
// We use Cloud Logging as it's an option available out of the box
job.logsPolicy = new batch.LogsPolicy();
job.logsPolicy.destination = batch.LogsPolicy.Destination.CLOUD_LOGGING;

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const response = await batchClient.createJob(request);
  console.log(response);
}

await callCreateJob();

Python

For more information, see the Batch Python API reference documentation.

To authenticate to Batch, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import batch_v1


def create_script_job_with_bucket(
    project_id: str, region: str, job_name: str, bucket_name: str
) -> batch_v1.Job:
    """
    This method shows how to create a sample Batch Job that will run
    a simple command on Cloud Compute instances.

    Args:
        project_id: project ID or project number of the Cloud project you want to use.
        region: name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
        job_name: the name of the job that will be created.
            It needs to be unique for each project and region pair.
        bucket_name: name of the bucket to be mounted for your Job.

    Returns:
        A job object representing the job created.
    """
    client = batch_v1.BatchServiceClient()

    # Define what will be done as part of the job.
    task = batch_v1.TaskSpec()
    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = "echo Hello world from task ${BATCH_TASK_INDEX}. >> /mnt/share/output_task_${BATCH_TASK_INDEX}.txt"
    task.runnables = [runnable]

    gcs_bucket = batch_v1.GCS()
    gcs_bucket.remote_path = bucket_name
    gcs_volume = batch_v1.Volume()
    gcs_volume.gcs = gcs_bucket
    gcs_volume.mount_path = "/mnt/share"
    task.volumes = [gcs_volume]

    # We can specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 500  # in milliseconds per cpu-second. This means the task requires 50% of a single CPU.
    resources.memory_mib = 16
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Tasks are grouped inside a job using TaskGroups.
    # Currently, it's possible to have only one task group.
    group = batch_v1.TaskGroup()
    group.task_count = 4
    group.task_spec = task

    # Policies are used to define what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    allocation_policy = batch_v1.AllocationPolicy()
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy.instances = [instances]

    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "script", "mount": "bucket"}
    # We use Cloud Logging as it's an option available out of the box
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

Use a network file system

You can create a job that uses an existing network file system (NFS), such as a Filestore file share, by using the Google Cloud console, the gcloud CLI, or the Batch API.

Before creating a job that uses an NFS, make sure that your network's firewall is properly configured to allow traffic between your job's VMs and the NFS. For more information, see Configuring firewall rules for Filestore.

The following example describes how to create a job that specifies and mounts an NFS. The job also has 3 tasks that each run a script to create a file in the NFS named output_task_TASK_INDEX.txt, where TASK_INDEX is the number of each task: 0, 1, and 2.
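Because each task's script expands BATCH_TASK_INDEX into that task's own index, the three tasks in this example write three distinct files. The file names can be sketched as follows (MOUNT_PATH stands in for whatever mount path you choose, as it does throughout this section):

```python
# Sketch: the files that the 3 tasks of the example job create on the mounted NFS.
# "MOUNT_PATH" is a placeholder for the mount path you choose, such as /mnt/disks/my-nfs.
mount_path = "MOUNT_PATH"
task_count = 3

# Each task's shell expands ${BATCH_TASK_INDEX} to that task's index: 0, 1, or 2.
output_files = [f"{mount_path}/output_task_{i}.txt" for i in range(task_count)]
print(output_files)
```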

Console

To create a job that uses an NFS by using the Google Cloud console, do the following:

  1. In the Google Cloud console, go to the Job list page.

    Go to Job list

  2. Click Create. The Create batch job page opens. In the left pane, select the Job details page.

  3. Configure the Job details page:

    1. Optional: In the Job name field, customize the job name.

      For example, enter example-nfs-job.

    2. Configure the Task details section:

      1. In the New runnable window, add at least one script or container for this job to run.

        For example, do the following:

        1. Select the Script checkbox. A text box appears.

        2. In the text box, enter the following script:

          echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt
          

          Replace MOUNT_PATH with the mount path that this job's runnables use to access the NFS. The path must start with /mnt/disks/ followed by a directory or path of your choice. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.

        3. Click Done.

      2. In the Task count field, enter the number of tasks for this job.

        For example, enter 3.

      3. In the Parallelism field, enter the number of tasks to run concurrently.

        For example, enter 1 (the default).

  4. Configure the Additional configurations page:

    1. In the left pane, click Additional configurations. The Additional configurations page opens.

    2. For each NFS that you want to mount for this job, do the following:

      1. In the Storage volumes section, click Add new volume. The New volume window appears.

      2. In the New volume window, do the following:

        1. In the Volume type section, select Network file system.

        2. In the File server field, enter the IP address of the server for the NFS that you specified in this job's runnable.

          For example, if your NFS is a Filestore file share, specify the IP address of the Filestore instance, which you can get by describing the Filestore instance.

        3. In the Remote path field, enter a path that can access the NFS that you specified in the previous step.

          The path of the NFS directory must start with a / followed by the root directory of the NFS.

        4. In the Mount path field, enter the mount path (MOUNT_PATH) to the NFS that you specified in the previous step.

    3. Click Done.

  5. Optional: Configure the other fields for this job.

  6. Optional: To review the job configuration, click Preview in the left pane.

  7. Click Create.

The Job details page displays the job that you created.

gcloud

To create a job that uses an NFS by using the gcloud CLI, use the gcloud batch jobs submit command. In the job's JSON configuration file, mount the NFS in the volumes field.

  1. Create a JSON file with the following contents:

    {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [
                        {
                            "script": {
                                "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                            }
                        }
                    ],
                    "volumes": [
                        {
                            "nfs": {
                                "server": "NFS_IP_ADDRESS",
                                "remotePath": "NFS_PATH"
                            },
                            "mountPath": "MOUNT_PATH"
                        }
                    ]
                },
                "taskCount": 3
            }
        ],
        "logsPolicy": {
            "destination": "CLOUD_LOGGING"
        }
    }
    

    Replace the following:

    • NFS_IP_ADDRESS: the IP address of the NFS. For example, if your NFS is a Filestore file share, specify the IP address of the Filestore instance, which you can get by describing the Filestore instance.
    • NFS_PATH: the path of the NFS directory that you want this job to access, which must start with a / followed by the root directory of the NFS. For example, for a Filestore file share named FILE_SHARE_NAME, the path /FILE_SHARE_NAME represents the root directory of the file share and the path /FILE_SHARE_NAME/subdirectory represents the subdirectory subdirectory.
    • MOUNT_PATH: the mount path that this job's runnables use to access the NFS. The path must start with /mnt/disks/ followed by a directory or path of your choice. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.
  2. Run the following command:

    gcloud batch jobs submit JOB_NAME \
      --location LOCATION \
      --config JSON_CONFIGURATION_FILE
    

    Replace the following:

    • JOB_NAME: the name of the job.
    • LOCATION: the location of the job.
    • JSON_CONFIGURATION_FILE: the path to a JSON file with the job's configuration details.
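If you submit many similar jobs, you can generate the JSON configuration file programmatically instead of writing it by hand. The following sketch builds the same JSON structure shown above; the helper name, its default task count, and the sample IP address and share path are our own illustrative choices, not part of any SDK:

```python
import json


def nfs_job_config(nfs_ip_address: str, nfs_path: str, mount_path: str,
                   task_count: int = 3) -> str:
    """Builds the JSON job configuration for a script job that mounts an NFS."""
    # Same script as in the example above; the shell on each VM expands
    # ${BATCH_TASK_INDEX}, so it is kept literal in the JSON.
    script_text = (
        "echo Hello world from task ${BATCH_TASK_INDEX}. >> "
        f"{mount_path}/output_task_${{BATCH_TASK_INDEX}}.txt"
    )
    config = {
        "taskGroups": [
            {
                "taskSpec": {
                    "runnables": [{"script": {"text": script_text}}],
                    "volumes": [
                        {
                            "nfs": {"server": nfs_ip_address, "remotePath": nfs_path},
                            "mountPath": mount_path,
                        }
                    ],
                },
                "taskCount": task_count,
            }
        ],
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }
    return json.dumps(config, indent=4)


# Example: a hypothetical Filestore share at 10.0.0.2 mounted at /mnt/disks/my-nfs.
print(nfs_job_config("10.0.0.2", "/my_share", "/mnt/disks/my-nfs"))
```

You can write the returned string to a file and pass that file to gcloud batch jobs submit with the --config flag.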

API

To create a job that uses an NFS by using the Batch API, use the jobs.create method and mount the NFS in the volumes field.

POST https://batch.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/jobs?job_id=JOB_NAME

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "script": {
                            "text": "echo Hello world from task ${BATCH_TASK_INDEX}. >> MOUNT_PATH/output_task_${BATCH_TASK_INDEX}.txt"
                        }
                    }
                ],
                "volumes": [
                    {
                        "nfs": {
                            "server": "NFS_IP_ADDRESS",
                            "remotePath": "NFS_PATH"
                        },
                        "mountPath": "MOUNT_PATH"
                    }
                ]
            },
            "taskCount": 3
        }
    ],
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

Replace the following:

  • PROJECT_ID: the project ID of your project.
  • LOCATION: the location of the job.
  • JOB_NAME: the name of the job.
  • NFS_IP_ADDRESS: the IP address of the network file system. For example, if your NFS is a Filestore file share, specify the IP address of the Filestore instance, which you can get by describing the Filestore instance.
  • NFS_PATH: the path of the NFS directory that you want this job to access, which must start with a / followed by the root directory of the NFS. For example, for a Filestore file share named FILE_SHARE_NAME, the path /FILE_SHARE_NAME represents the root directory of the file share and the path /FILE_SHARE_NAME/subdirectory represents a subdirectory.
  • MOUNT_PATH: the mount path that this job's runnables use to access the NFS. The path must start with /mnt/disks/ followed by a directory or path of your choice. For example, if you want to represent this NFS with a directory named my-nfs, set the mount path to /mnt/disks/my-nfs.

Java


import com.google.cloud.batch.v1.AllocationPolicy;
import com.google.cloud.batch.v1.BatchServiceClient;
import com.google.cloud.batch.v1.ComputeResource;
import com.google.cloud.batch.v1.CreateJobRequest;
import com.google.cloud.batch.v1.Job;
import com.google.cloud.batch.v1.LogsPolicy;
import com.google.cloud.batch.v1.NFS;
import com.google.cloud.batch.v1.Runnable;
import com.google.cloud.batch.v1.TaskGroup;
import com.google.cloud.batch.v1.TaskSpec;
import com.google.cloud.batch.v1.Volume;
import com.google.protobuf.Duration;
import java.io.IOException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class CreateScriptJobWithNfs {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    // Project ID or project number of the Cloud project you want to use.
    String projectId = "YOUR_PROJECT_ID";

    // Name of the region you want to use to run the job. Regions that are
    // available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
    String region = "europe-central2";

    // The name of the job that will be created.
    // It needs to be unique for each project and region pair.
    String jobName = "JOB_NAME";

    // The path of the NFS directory that you want this job to access.
    String nfsPath = "NFS_PATH";
    // The IP address of the Network File System.
    String nfsIpAddress = "NFS_IP_ADDRESS";

    createScriptJobWithNfs(projectId, region, jobName, nfsPath, nfsIpAddress);
  }

  // This method shows how to create a batch script job that specifies and mounts a NFS.
  public static Job createScriptJobWithNfs(String projectId, String region, String jobName,
                                            String nfsPath, String nfsIpAddress)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests.
    try (BatchServiceClient batchServiceClient = BatchServiceClient.create()) {

      // Define what will be done as part of the job.
      Runnable runnable =
          Runnable.newBuilder()
              .setScript(
                  Runnable.Script.newBuilder()
                      .setText(
                          "echo Hello world from task ${BATCH_TASK_INDEX}. >> "
                              + "/mnt/share/output_task_${BATCH_TASK_INDEX}.txt")
                      // You can also run a script from a file. Just remember that it needs to be
                      // a script that's already on the VM that will be running the job.
                      // Using setText() and setPath() is mutually exclusive.
                      // .setPath("/tmp/test.sh")
                      .build())
              .build();

      // Describes a volume and parameters for it to be mounted to a VM.
      Volume volume = Volume.newBuilder()
          .setNfs(NFS.newBuilder()
              .setServer(nfsIpAddress)
              .setRemotePath(nfsPath)
              .build())
          .setMountPath("/mnt/share")
          .build();

      // We can specify what resources are requested by each task.
      ComputeResource computeResource =
          ComputeResource.newBuilder()
              // In milliseconds per cpu-second. This means the task requires 50% of a single CPU.
              .setCpuMilli(500)
              // In MiB.
              .setMemoryMib(16)
              .build();

      TaskSpec task =
          TaskSpec.newBuilder()
              // Jobs can be divided into tasks. In this case, we have only one task.
              .addRunnables(runnable)
              .addVolumes(volume)
              .setComputeResource(computeResource)
              .setMaxRetryCount(2)
              .setMaxRunDuration(Duration.newBuilder().setSeconds(3600).build())
              .build();

      // Tasks are grouped inside a job using TaskGroups.
      // Currently, it's possible to have only one task group.
      TaskGroup taskGroup = TaskGroup.newBuilder().setTaskCount(3).setTaskSpec(task).build();

      // Policies are used to define what kind of virtual machines the tasks will run on.
      // In this case, we tell the system to use "e2-standard-4" machine type.
      // Read more about machine types here:
      // https://cloud.google.com/compute/docs/machine-types
      AllocationPolicy.InstancePolicy instancePolicy =
          AllocationPolicy.InstancePolicy.newBuilder().setMachineType("e2-standard-4").build();

      AllocationPolicy allocationPolicy =
          AllocationPolicy.newBuilder()
              .addInstances(AllocationPolicy.InstancePolicyOrTemplate.newBuilder()
                      .setPolicy(instancePolicy).build())
              .build();

      Job job =
          Job.newBuilder()
              .addTaskGroups(taskGroup)
              .setAllocationPolicy(allocationPolicy)
              .putLabels("env", "testing")
              .putLabels("type", "script")
              .putLabels("mount", "nfs")
              // We use Cloud Logging as it's an option available out of the box.
              .setLogsPolicy(LogsPolicy.newBuilder()
                      .setDestination(LogsPolicy.Destination.CLOUD_LOGGING).build())
              .build();

      CreateJobRequest createJobRequest =
          CreateJobRequest.newBuilder()
              // The job's parent is the region in which the job will run.
              .setParent(String.format("projects/%s/locations/%s", projectId, region))
              .setJob(job)
              .setJobId(jobName)
              .build();

      Job result =
          batchServiceClient
              .createJobCallable()
              .futureCall(createJobRequest)
              .get(5, TimeUnit.MINUTES);

      System.out.printf("Successfully created the job: %s", result.getName());

      return result;
    }
  }
}

Node.js

// Imports the Batch library
const batchLib = require('@google-cloud/batch');
const batch = batchLib.protos.google.cloud.batch.v1;

// Instantiates a client
const batchClient = new batchLib.v1.BatchServiceClient();

/**
 * TODO(developer): Update these variables before running the sample.
 */
// Project ID or project number of the Google Cloud project you want to use.
const projectId = await batchClient.getProjectId();
// Name of the region you want to use to run the job. Regions that are
// available for Batch are listed on: https://cloud.google.com/batch/docs/get-started#locations
const region = 'europe-central2';
// The name of the job that will be created.
// It needs to be unique for each project and region pair.
const jobName = 'batch-nfs-job';
// The path of the NFS directory that you want this job to access.
const nfsPath = '/your_nfs_path';
// The IP address of the Network File System.
const nfsIpAddress = '0.0.0.0';
// The mount path that the job's tasks use to access the NFS.
const mountPath = '/mnt/disks/my-nfs';

// Define what will be done as part of the job.
const runnable = new batch.Runnable({
  script: new batch.Runnable.Script({
    // The script writes a file to the mounted NFS. BATCH_TASK_INDEX is
    // expanded by the shell on the VM, so it is kept literal here.
    text:
      'echo Hello world from task ${BATCH_TASK_INDEX}. >> ' +
      mountPath +
      '/output_task_${BATCH_TASK_INDEX}.txt',
  }),
});

// Define a volume that uses NFS.
const volume = new batch.Volume({
  nfs: new batch.NFS({
    server: nfsIpAddress,
    remotePath: nfsPath,
  }),
  mountPath,
});

// Specify what resources are requested by each task.
const computeResource = new batch.ComputeResource({
  // In milliseconds per cpu-second. This means the task requires 50% of a single CPU.
  cpuMilli: 500,
  // In MiB.
  memoryMib: 16,
});

const task = new batch.TaskSpec({
  runnables: [runnable],
  volumes: [volume],
  computeResource,
  maxRetryCount: 2,
  maxRunDuration: {seconds: 3600},
});

// Tasks are grouped inside a job using TaskGroups.
const group = new batch.TaskGroup({
  taskCount: 3,
  taskSpec: task,
});

// Policies are used to define what kind of virtual machines the tasks will run on.
// In this case, we tell the system to use "e2-standard-4" machine type.
// Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
const instancePolicy = new batch.AllocationPolicy.InstancePolicy({
  machineType: 'e2-standard-4',
});

const allocationPolicy = new batch.AllocationPolicy({
  instances: [{policy: instancePolicy}],
});

const job = new batch.Job({
  name: jobName,
  taskGroups: [group],
  allocationPolicy,
  labels: {env: 'testing', type: 'script'},
  // We use Cloud Logging as it's an option available out of the box
  logsPolicy: new batch.LogsPolicy({
    destination: batch.LogsPolicy.Destination.CLOUD_LOGGING,
  }),
});

// The job's parent is the project and region in which the job will run
const parent = `projects/${projectId}/locations/${region}`;

async function callCreateBatchNfsJob() {
  // Construct request
  const request = {
    parent,
    jobId: jobName,
    job,
  };

  // Run request
  const [response] = await batchClient.createJob(request);
  console.log(JSON.stringify(response));
}

await callCreateBatchNfsJob();

Python

from google.cloud import batch_v1


def create_job_with_network_file_system(
    project_id: str,
    region: str,
    job_name: str,
    mount_path: str,
    nfs_ip_address: str,
    nfs_path: str,
) -> batch_v1.Job:
    """
    Creates a Batch job that mounts a Network File System (NFS).
    The function mounts an NFS volume using the provided NFS server IP address and path.

    Args:
        project_id (str): project ID or project number of the Cloud project you want to use.
        region (str): name of the region you want to use to run the job. Regions that are
            available for Batch are listed on: https://cloud.google.com/batch/docs/locations
        job_name (str): the name of the job that will be created.
            It needs to be unique for each project and region pair.
        mount_path (str): The mount path that the job's tasks use to access the NFS.
        nfs_ip_address (str): The IP address of the NFS server (e.g., Filestore instance).
            Documentation on how to create a
            Filestore instance is available here: https://cloud.google.com/filestore/docs/create-instance-gcloud
        nfs_path (str): The path of the NFS directory that the job accesses.
            The path must start with a / followed by the root directory of the NFS.

    Returns:
        batch_v1.Job: The created Batch job object containing configuration details.
    """
    client = batch_v1.BatchServiceClient()

    # Create a runnable with a script that writes a message to a file
    runnable = batch_v1.Runnable()
    runnable.script = batch_v1.Runnable.Script()
    runnable.script.text = f"echo Hello world from task ${{BATCH_TASK_INDEX}}. >> {mount_path}/output_task_${{BATCH_TASK_INDEX}}.txt"

    # Define a volume that uses NFS
    volume = batch_v1.Volume()
    volume.nfs = batch_v1.NFS(server=nfs_ip_address, remote_path=nfs_path)
    volume.mount_path = mount_path

    # Create a task specification and assign the runnable and volume to it
    task = batch_v1.TaskSpec()
    task.runnables = [runnable]
    task.volumes = [volume]

    # Specify what resources are requested by each task.
    resources = batch_v1.ComputeResource()
    resources.cpu_milli = 2000  # in milliseconds per cpu-second. This means the task requires 2 whole CPUs.
    resources.memory_mib = 16  # in MiB
    task.compute_resource = resources

    task.max_retry_count = 2
    task.max_run_duration = "3600s"

    # Create a task group and assign the task specification to it
    group = batch_v1.TaskGroup()
    group.task_count = 3
    group.task_spec = task

    # Policies are used to define what kind of virtual machines the tasks will run on.
    # In this case, we tell the system to use "e2-standard-4" machine type.
    # Read more about machine types here: https://cloud.google.com/compute/docs/machine-types
    policy = batch_v1.AllocationPolicy.InstancePolicy()
    policy.machine_type = "e2-standard-4"
    instances = batch_v1.AllocationPolicy.InstancePolicyOrTemplate()
    instances.policy = policy
    allocation_policy = batch_v1.AllocationPolicy()
    allocation_policy.instances = [instances]

    # Create the job and assign the task group and allocation policy to it
    job = batch_v1.Job()
    job.task_groups = [group]
    job.allocation_policy = allocation_policy
    job.labels = {"env": "testing", "type": "script"}
    # We use Cloud Logging as it's an option available out of the box
    job.logs_policy = batch_v1.LogsPolicy()
    job.logs_policy.destination = batch_v1.LogsPolicy.Destination.CLOUD_LOGGING

    # Create the job request and set the job and job ID
    create_request = batch_v1.CreateJobRequest()
    create_request.job = job
    create_request.job_id = job_name
    # The job's parent is the region in which the job will run
    create_request.parent = f"projects/{project_id}/locations/{region}"

    return client.create_job(create_request)

What's next