[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Translation issue","translationIssue","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-03 UTC."],[[["\u003cp\u003eGoogle Cloud provides robust GPU infrastructure to support demanding AI, machine learning, scientific, and enterprise workloads, with optimized software and networking options.\u003c/p\u003e\n"],["\u003cp\u003eUsers can leverage GPU-accelerated virtual machines (VMs) via accelerator-optimized machine families like A3, A2, and G2, designed for maximum performance.\u003c/p\u003e\n"],["\u003cp\u003eVertex AI offers various ways to utilize GPU-enabled VMs to improve model training, deployment, and prediction latency, along with supporting open-source large language models.\u003c/p\u003e\n"],["\u003cp\u003eCluster Director (formerly Hypercompute Cluster) enables the creation of clusters of GPU-accelerated VMs, managed as a single unit, suited for AI, ML, and high-performance computing (HPC) workloads.\u003c/p\u003e\n"],["\u003cp\u003eCompute Engine provides options to create individual VMs or small clusters with attached GPUs for graphics-intensive tasks and small-scale model training, while Cloud Run supports GPU configuration for AI inference workloads.\u003c/p\u003e\n"]]],[],null,["# About GPUs on Google Cloud\n\n*** ** * ** ***\n\nGoogle Cloud is focused on delivering world-class artificial intelligence (AI)\ninfrastructure to power your most demanding GPU-accelerated workloads across a\nwide range of segments. 
You can use GPUs on Google Cloud to run AI, machine\nlearning (ML), scientific, analytics, engineering, consumer, and enterprise\napplications.\n\nThrough our partnership with NVIDIA, Google Cloud delivers the latest GPUs while\noptimizing the software stack with a wide array of storage and networking\noptions. For a full list of available GPUs, see [GPU platforms](/compute/docs/gpus).\n\nThe following sections outline the benefits of GPUs on Google Cloud.\n\nGPU-accelerated VMs\n-------------------\n\nOn Google Cloud, you can access and provision GPUs in the way that best suits\nyour needs. A specialized [accelerator-optimized machine family](/compute/docs/accelerator-optimized-machines) is\navailable, with pre-attached GPUs and networking capabilities that are ideal for\nmaximizing performance. These are available in the A4X, A4, A3, A2, and G2\nmachine series.\n\nMultiple provisioning options\n-----------------------------\n\nYou can provision clusters by using the accelerator-optimized machine family\nwith any of the following open-source or Google Cloud products.\n\n### Vertex AI\n\nVertex AI is a fully managed machine learning (ML) platform that you\ncan use to train and deploy ML models and AI applications. 
In Vertex AI\napplications, you can use GPU-accelerated VMs to improve performance in the\nfollowing ways:\n\n- [Use GPU-enabled VMs](/vertex-ai/docs/training/configure-compute) in custom training GKE worker pools.\n- Use [open-source LLMs from the Vertex AI Model Garden](/vertex-ai/generative-ai/docs/open-models/use-open-models).\n- Reduce [prediction](/vertex-ai/docs/predictions/configure-compute#gpus) latency.\n- Improve performance of [Vertex AI Workbench](/vertex-ai/docs/workbench/instances/change-machine-type) notebook code.\n- Improve performance of a [Colab Enterprise runtime](/colab/docs/create-runtime-template).\n\n### Cluster Director\n\nCluster Director (formerly known as *Hypercompute Cluster*) is a set of\nfeatures and services that are designed to let you deploy and manage large\nnumbers (up to tens of thousands) of accelerator and networking resources that\nfunction as a single homogeneous unit. This option is ideal for creating a\ndensely allocated, performance-optimized infrastructure that has integrations\nfor Google Kubernetes Engine (GKE) and Slurm schedulers. Cluster Director helps\nyou build an infrastructure that is specifically designed for running AI, ML,\nand HPC workloads. For more information, see [Cluster Director](/ai-hypercomputer/docs/hypercompute-cluster).\n\nTo get started with Cluster Director, see [Choose a deployment strategy](/ai-hypercomputer/docs/choose-strategy).\n\n### Compute Engine\n\nYou can also create and manage individual VMs or small clusters of VMs with\nattached GPUs on Compute Engine. This method is mostly used for running\ngraphics-intensive workloads, simulation workloads, or small-scale ML model\ntraining.\n\nThe following table shows the methods that you can use to create VMs that have\nGPUs attached:\n\n### Cloud Run\n\nYou can configure GPUs for your Cloud Run instances. 
GPUs are ideal for\nrunning AI inference workloads that use large language models on Cloud Run.\n\nFor more information about running AI workloads on GPUs on Cloud Run, see the\nfollowing resources:\n\n- [Configure GPUs for a Cloud Run service](/run/docs/configuring/services/gpu)\n- [Load large ML models on Cloud Run with GPUs](/run/docs/configuring/services/gpu-best-practices#model-loading-recommendations)\n- [Tutorial: Run LLM inference on Cloud Run GPUs with Ollama](/run/docs/tutorials/gpu-gemma2-with-ollama)"]]