
Cheaper Cloud AI deployments with NVIDIA T4 GPU price cut

January 23, 2020

Chris Kleban

Group Product Manager, Google Cloud

Adam Kerin

Product Marketing

Google Cloud offers a wide range of GPUs to accelerate everything from AI deployment to 3D visualization. These use cases are now even more affordable thanks to a price reduction for the NVIDIA T4 GPU. As of early January, we’ve reduced T4 prices by more than 60%, making it the lowest-cost GPU instance on Google Cloud.

Figure: Hourly pricing per T4 GPU (https://storage.googleapis.com/gweb-cloudblog-publish/images/Hourly_Pricing_Per_T4_GPU_3.max-1500x1500.png)

Prices above are for us-central1 and vary by region. See the full GPU pricing table for other regions.

Locations and configurations

Google Cloud was the first major cloud provider to launch the T4 GPU and offer it globally (in eight regions). This worldwide footprint, combined with the performance of the T4 Tensor Cores, opens up more possibilities to our customers. Since our global rollout, T4 performance has improved. VMs with T4 and V100 GPUs now support networking speeds of up to 100 Gbps (in beta), with additional regions coming online in the future.

These GPU instances are also flexible enough to suit different workloads. The T4 GPUs can be attached to our n1 machine types, which support custom VM shapes. This means you can create a VM tailored specifically to your needs, whether it’s a low-cost option like one vCPU, one GB of memory, and one T4 GPU, or a high-performance configuration with 96 vCPUs, 624 GB of memory, and four T4 GPUs, and almost anything in between. This is helpful for machine learning (ML), since you may want to adjust your vCPU count based on your pre-processing needs. For visualization, you can create VM shapes for lower-end solutions all the way up to powerful, cloud-based professional workstations.
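As a rough illustration of the two custom shapes mentioned above, the following sketch assembles a `gcloud compute instances create` command for an n1 custom VM with T4 GPUs attached. The helper name `t4_instance_command`, the instance names, and the zone are hypothetical placeholders, not values from this post. The script only prints the commands, so check zone availability, quotas, and permitted vCPU, memory, and GPU combinations before running either one.

```python
import shlex


def t4_instance_command(name, zone, vcpus, memory_gb, t4_count, preemptible=False):
    """Build (but do not run) a gcloud command for a custom n1 VM with T4 GPUs.

    Hypothetical helper for illustration only. It does not validate that the
    requested shape is actually permitted in the chosen zone.
    """
    args = [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        # Custom VM shape: vCPU count and memory are chosen independently.
        f"--custom-cpu={vcpus}",
        f"--custom-memory={memory_gb}GB",
        # Attach the requested number of T4 GPUs.
        f"--accelerator=type=nvidia-tesla-t4,count={t4_count}",
        # VMs with GPUs cannot live-migrate, so they must terminate on maintenance.
        "--maintenance-policy=TERMINATE",
    ]
    if preemptible:
        args.append("--preemptible")
    return args


if __name__ == "__main__":
    # Zone and instance names below are assumptions for the example.
    small = t4_instance_command("t4-small", "us-central1-a", 1, 1, 1, preemptible=True)
    large = t4_instance_command("t4-large", "us-central1-a", 96, 624, 4)
    for cmd in (small, large):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Building the argument list separately from running it makes it easy to review the shape, or to adjust the vCPU count for heavier pre-processing, before any resources are created.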

Machine Learning

With mixed-precision support and 16 GB of memory, the T4 is also a great option for ML workloads. For example, Compute Engine preemptible VMs work well for batch ML inference workloads, offering lower-cost compute in exchange for variable capacity availability. We previously shared sample T4 GPU performance numbers for ML inference of up to 4,267 images per second (ResNet-50, batch size 128, INT8 precision). That means you can perform roughly 15 million image predictions in an hour for a $0.11 add-on cost for a single T4 GPU with your n1 VM.
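The arithmetic behind that estimate can be checked with a short sketch. It uses only the two figures quoted above (4,267 images per second and a $0.11 hourly GPU add-on cost). It assumes the peak throughput is sustained for the full hour and leaves out the cost of the n1 VM itself, so treat the result as a best-case figure. The function names are illustrative.

```python
# Figures quoted in the post.
IMAGES_PER_SECOND = 4267   # ResNet-50, batch size 128, INT8
T4_PRICE_PER_HOUR = 0.11   # USD, add-on cost for a single T4 GPU

SECONDS_PER_HOUR = 3600


def images_per_hour(images_per_second):
    """Images processed in one hour at a constant throughput."""
    return images_per_second * SECONDS_PER_HOUR


def gpu_cost_per_million_images(images_per_second, price_per_hour):
    """GPU-only cost in USD per one million predictions (VM cost excluded)."""
    return price_per_hour / images_per_hour(images_per_second) * 1_000_000


if __name__ == "__main__":
    hourly = images_per_hour(IMAGES_PER_SECOND)
    per_million = gpu_cost_per_million_images(IMAGES_PER_SECOND, T4_PRICE_PER_HOUR)
    print(f"{hourly:,} images per hour")                  # 15,361,200 images per hour
    print(f"${per_million:.4f} per million images (GPU only)")
```

At these figures, the GPU add-on works out to less than one cent per million images. Real batch jobs will typically see lower throughput once data loading and pre-processing are included.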

Google Cloud offers several options to access these GPUs. One of the simplest ways to get started is through Deep Learning VM Images for AI Platform and Compute Engine, and Deep Learning Containers for Google Kubernetes Engine (GKE). These are configured for software compatibility and performance, and come pre-packaged with your favorite ML frameworks, including PyTorch and TensorFlow Enterprise.

We’re committed to making GPU acceleration more accessible, whatever your budget and performance requirements may be. With the reduced cost of NVIDIA T4 instances, we now have a broad selection of accelerators for a multitude of workloads, performance levels, and price points. Check out the full pricing table and regional availability and try the NVIDIA T4 GPU for your workload today.
