AI Infrastructure

Scalable, high-performance, and cost-effective infrastructure for every ML workload.

  • AI accelerators for every use case, from high-performance training to low-cost inference

  • Scale faster with Vertex AI’s fully managed infrastructure, purpose-built for AI workloads

  • Groundbreaking algorithms with optimized infrastructure, built by Google Research and partners


Optimize performance and cost, at scale

With Google Cloud, you can choose from GPUs, TPUs, or CPUs to support a variety of use cases, including high-performance training, low-cost inference, and large-scale data processing.

Deliver results faster with managed infrastructure

Scale faster and more efficiently with managed infrastructure provided by Vertex AI. Set up ML environments quickly, automate orchestration, manage large clusters, and deploy low-latency applications.

Innovate faster with state-of-the-art AI

Drive more value from ML with access to state-of-the-art AI from Google Research, DeepMind, and partners.

Key features

Flexible and scalable hardware for any use case

With Google Cloud, you can choose from GPUs, TPUs, or CPUs to support a variety of use cases, including high-performance training, low-cost inference, and large-scale data processing. Move faster with Tensor Processing Units (TPUs) for training and running deep neural networks at scale with optimized cost and training time.

Not all machine learning models are the same, and different models benefit from different levels of hardware acceleration. You can choose from a range of NVIDIA GPUs for cost-effective inference or for scale-up and scale-out training. Finally, you get access to CPU platforms when you start a VM instance on Compute Engine, which offers a range of both Intel and AMD processors for your VMs.

Low-latency serving

Vertex AI provides the purpose-built infrastructure needed to manage ML processes automatically. Deploy to fully managed endpoints with autoscaling, private endpoints, and a wide selection of CPUs and GPUs.

The optimized TensorFlow runtime enables model precompilation across GPUs and CPUs, with up to 8x throughput and 6x lower latency for tabular models.

Large-scale training

Vertex AI provides managed networking capabilities to help customers scale multi-node training and reduce training time.

Managed Training Jobs: Submit-and-forget training jobs with queue management, NVIDIA GPUs and TPUs, and built-in hyperparameter optimization.

Reduction Server: Optimize distributed GPU training for synchronous data-parallel algorithms, with up to a 30% to 40% reduction in training time and cost.

Cloud Storage FUSE and NFS: Simplify and accelerate your ML training jobs with built-in file and object storage options within Vertex AI, including support for Cloud Storage Autoclass.
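
To make the Reduction Server item above concrete, here is a minimal sketch of the gradient-averaging step at the heart of synchronous data-parallel training. It is illustrative only and is not Vertex AI's or Reduction Server's implementation: the function names (average_gradients, sync_step) and the toy values are assumptions made for this example. In a real job, this averaging runs as an all-reduce across machines, and that communication step is the kind of cost an optimization like Reduction Server targets.

```python
from typing import List


def average_gradients(worker_grads: List[List[float]]) -> List[float]:
    """Element-wise mean of the gradients computed by each worker.

    This stands in for the all-reduce step: every worker contributes its
    local gradient and every worker ends up with the same average.
    """
    if not worker_grads:
        raise ValueError("need gradients from at least one worker")
    num_workers = len(worker_grads)
    return [sum(values) / num_workers for values in zip(*worker_grads)]


def sync_step(weights: List[float],
              worker_grads: List[List[float]],
              lr: float) -> List[float]:
    """Apply one synchronous SGD update using the averaged gradient."""
    avg = average_gradients(worker_grads)
    return [w - lr * g for w, g in zip(weights, avg)]


# Two hypothetical workers, each holding a gradient from its own data shard.
grads = [[1.0, 2.0], [3.0, 4.0]]
new_weights = sync_step([2.0, 2.0], grads, lr=0.5)
print(new_weights)  # [1.0, 0.5]
```

Because every worker applies the same averaged gradient, all model replicas stay identical after each step. That is what makes the scheme synchronous, and it is also why the speed of the averaging step limits how well training scales.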

State-of-the-art AI algorithms

Access state-of-the-art AI algorithms developed by Google Research and partners to streamline complex AI use cases with optimized infrastructure built in. Reduce complexity and speed up time to value with algorithms like Neural Architecture Search (NAS), TabNet, AlphaFold, and NVIDIA Merlin.

Highly flexible and open platform

We are committed to giving customers the ultimate freedom to pick the ML framework or infrastructure services that work best for them. 

Easily access and use any tool, API, or framework for your ML workloads from a single unified data and AI cloud platform, so your teams can pick the framework that best suits their preferences and development efficiency.

Access a rich set of building blocks, such as Deep Learning VMs and containers, and a marketplace of curated ISV offerings to help you architect your own custom software stack on VMs, Google Kubernetes Engine (GKE), or both.



Google Cloud Basics
Using GPUs for training models in the cloud

GPUs can accelerate the training process for deep learning models for tasks like image classification, video analysis, and natural language processing.

Google Cloud Basics
Using TPUs to train your model

TPUs are Google's custom-developed ASICs used to accelerate machine learning workloads. You can run your training jobs on AI Platform Training, using Cloud TPU.

What makes TPUs fine-tuned for deep learning?

Learn about the computational requirements of deep learning and how CPUs, GPUs, and TPUs handle the task.

Google Cloud Basics
Deep Learning VM

Deep Learning VM images are optimized for data science and machine learning tasks. They come with key ML frameworks and tools pre-installed, and work with GPUs.

Google Cloud Basics
AI Platform Deep Learning Containers

AI Platform Deep Learning Containers are performance-optimized, consistent environments to help you prototype and implement workflows quickly. They work with GPUs.



Pricing for AI Infrastructure is based on the product selected. You can try AI Infrastructure for free.

Cloud TPU: For information on TPU pricing for single-device TPU types and TPU Pod types, refer to TPU pricing.

Cloud GPU: For information about GPU pricing for the different GPU types and regions that are available on Compute Engine, refer to GPU pricing.