Increase network traffic speed for GPU nodes

This page shows you how to increase network bandwidth for GPU nodes in Google Kubernetes Engine (GKE) clusters by using Google Virtual NIC (gVNIC).

In Autopilot clusters, nodes that run GKE version 1.30.2-gke.1023000 and later have Google Virtual NIC (gVNIC) enabled automatically. The instructions on this page apply only to Standard clusters.
Before you begin

Before you start, make sure that you have performed the following tasks:

- Enable the Google Kubernetes Engine API.
- If you want to use the Google Cloud CLI for this task, install and then initialize the gcloud CLI. If you previously installed the gcloud CLI, get the latest version by running gcloud components update. For existing installations, set a default compute/region (or, for zonal clusters, compute/zone) property to avoid errors like "One of [--zone, --region] must be supplied".

The steps on this page apply only to GPU nodes. To increase bandwidth on nodes without GPUs, consider enabling Tier-1 bandwidth.
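For non-GPU node pools, a sketch of enabling Tier-1 bandwidth might look like the following. The cluster and pool names are placeholders, and Tier-1 bandwidth has its own requirements beyond this page (a supported machine type with enough vCPUs, such as n2-standard-32, and gVNIC):

```shell
# Sketch only: create a node pool with Tier-1 networking enabled.
# "my-cluster" and "np-highbw" are placeholder names; adjust the
# machine type to one that supports Tier-1 bandwidth.
gcloud container node-pools create np-highbw \
    --cluster=my-cluster \
    --machine-type=n2-standard-32 \
    --enable-gvnic \
    --network-performance-configs=total-egress-bandwidth-tier=TIER_1
```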
Limitations

- Compute Engine gVNIC limitations apply.

Requirements

- GKE nodes must use a Container-Optimized OS node image.
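To check whether an existing node pool already meets this requirement, one option (cluster, pool, and location names below are placeholders) is to read the pool's image type:

```shell
# Sketch only: print the image type of an existing node pool.
# "my-pool", "my-cluster", and the location are placeholder values.
# Container-Optimized OS pools report COS_CONTAINERD.
gcloud container node-pools describe my-pool \
    --cluster=my-cluster \
    --location=us-central1-a \
    --format="value(config.imageType)"
```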
Enable gVNIC

You can create a cluster that has node pools that use gVNIC, create a node pool with gVNIC enabled, or update a node pool to use gVNIC.

Create a cluster

Create a cluster with node pools that use gVNIC:
    gcloud container clusters create CLUSTER_NAME \
        --accelerator type=GPU_TYPE,count=AMOUNT \
        --machine-type=MACHINE_TYPE \
        --enable-gvnic

Replace the following:

- CLUSTER_NAME: the name of the new cluster.
- GPU_TYPE: the type of GPU accelerator that you use. For example, nvidia-tesla-t4.
- AMOUNT: the number of GPUs to attach to nodes in the node pool.
- MACHINE_TYPE: the type of machine that you want to use. gVNIC is not supported on memory-optimized machine types.

Create a node pool

Create a node pool that uses gVNIC:

    gcloud container node-pools create NODEPOOL_NAME \
        --cluster=CLUSTER_NAME \
        --enable-gvnic

Replace the following:

- NODEPOOL_NAME: the name of the new node pool.
- CLUSTER_NAME: the name of the existing cluster.

Update a node pool

Update a node pool to use gVNIC:

    gcloud container node-pools update NODEPOOL_NAME \
        --cluster=CLUSTER_NAME \
        --enable-gvnic

Replace the following:

- NODEPOOL_NAME: the name of the node pool that you want to update.
- CLUSTER_NAME: the name of the existing cluster.

This change requires recreating the nodes, which can cause disruption to your running workloads. GKE immediately begins recreating the nodes using the node upgrade strategy, regardless of active maintenance policies, and disabling node auto-upgrades doesn't prevent the change. Ensure that the workloads running on the nodes are prepared for disruption before you initiate this change.

Disable gVNIC

Update the node pool using the --no-enable-gvnic flag:

    gcloud container node-pools update NODEPOOL_NAME \
        --cluster=CLUSTER_NAME \
        --no-enable-gvnic

This change also requires recreating the nodes, with the same caveats as enabling gVNIC: node recreation begins immediately, regardless of active maintenance policies, so make sure your workloads are prepared for disruption before you initiate it.

Troubleshooting

To troubleshoot gVNIC, see Troubleshooting Google Virtual NIC.

What's next

- Use network policy logging to record when connections to Pods are allowed or denied by your cluster's network policies.

Last updated 2024-11-26 UTC.
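To confirm that gVNIC is actually in effect after enabling it, one check is to inspect the NIC type of a node's underlying Compute Engine VM. The instance name and zone below are placeholders (you can list node VM names with kubectl get nodes):

```shell
# Sketch only: print the NIC type of a node's underlying Compute Engine VM.
# The instance name and zone are placeholders; substitute your own values.
# The command prints GVNIC when gVNIC is enabled, VIRTIO_NET otherwise.
gcloud compute instances describe gke-my-cluster-my-pool-1a2b3c4d-xyz1 \
    --zone=us-central1-a \
    --format="value(networkInterfaces[0].nicType)"
```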