[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-04。"],[],[],null,["# Deploy GPU container workloads\n\nThis page describes how to deploy GPU container workloads on the\nGoogle Distributed Cloud (GDC) Sandbox AI Optimized SKU.\n\nDeploy GPU container workloads\n------------------------------\n\nThe GDC Sandbox AI Optimized SKU includes four NVIDIA H100 80GB HBM3 GPUs within\nthe org-infra cluster. These GPUs are accessible using the resource name\n`nvidia.com/gpu-pod-NVIDIA_H100_80GB_HBM3`. This section describes how to update\na container configuration to use these GPUS.\n\nThe GPUs in GDC Sandbox AI Optimized SKU are associated with a pre-configured\nproject, \"**sandbox-gpu-project**\". You must deploy your container using this\nproject in order to make use of the GPUs.\n| **Note:** For GPU workloads, there is no direct access to HaaS (Harbor as a Service). You must be able to fetch images from the Google Cloud Artifact Registry or internet.\n\n### Before you begin\n\n- To run commands against the org infrastructure cluster, make sure that you\n have the kubeconfig of the `org-1-infra` cluster, as described in\n [Work with clusters](/distributed-cloud/sandbox/latest/clusters#org-infra-cluster):\n\n - Configure and authenticate with the `gdcloud` command line, and\n - generate the kubeconfig file for the org infrastructure cluster, and assign its path to the environment variable `KUBECONFIG`.\n- To run the workloads, you must have the `sandbox-gpu-admin` role assigned.\n By default, the role is assigned to the `platform-admin` user. You can\n assign the role to other users by signing in as the `platform-admin` and\n running the following command:\n\n kubectl --kubeconfig ${KUBECONFIG} create rolebinding ${NAME} --role=sandbox-gpu-admin \\\n --user=${USER} --namespace=sandbox-gpu-project\n\n### Configure a container to use GPU resources\n\n1. Add the `.containers.resources.requests` and `.containers.resources.limits`\n fields to your container specification to request GPUs for the workload. All\n containers within the sandbox-gpu-project can request up to a total of 4\n GPUs across the entire project. The following example requests one GPU as\n part of the container specification.\n\n apiVersion: apps/v1\n kind: Deployment\n metadata:\n name: nginx-deployment\n namespace: sandbox-gpu-project\n labels:\n app: nginx\n spec:\n replicas: 1\n selector:\n matchLabels:\n app: nginx\n template:\n metadata:\n labels:\n app: nginx\n spec:\n containers:\n - name: nginx\n image: nginx:latest\n resources:\n requests:\n nvidia.com/gpu-pod-NVIDIA_H100_80GB_HBM3: 1\n limits:\n nvidia.com/gpu-pod-NVIDIA_H100_80GB_HBM3: 1\n\n| **Note:** If you are using GDC Sandbox AI Optimized with A100 GPUs, the GPUs are accessible using the resource name `nvidia.com/gpu-pod-NVIDIA_A100_SXM4_80GB`. Substitute this resource name for `nvidia.com/gpu-pod-NVIDIA_H100_80GB_HBM3` in the configuration file.\n\n1. Containers also require additional permissions to access GPUs. For each\n container that requests GPUs, add the following permissions to your\n container spec:\n\n securityContext:\n seLinuxOptions:\n type: unconfined_t\n\n2. 
3. Apply your container manifest file:

       kubectl apply -f ${CONTAINER_MANIFEST_FILE_PATH} \
           -n sandbox-gpu-project \
           --kubeconfig ${KUBECONFIG}
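After you apply the manifest, you can confirm that the pod scheduled and
received its GPU allocation. The following is a suggested verification step
using standard `kubectl` commands, not part of the original procedure; the
`app=nginx` label matches the example deployment above:

    # List the pods in the project and check that the pod is Running.
    kubectl get pods -n sandbox-gpu-project --kubeconfig ${KUBECONFIG}

    # Inspect the pod's resource requests and limits to confirm the GPU allocation.
    kubectl describe pods -l app=nginx -n sandbox-gpu-project --kubeconfig ${KUBECONFIG}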