Deploy GPU workloads across all your clouds with Anthos and NVIDIA

August 17, 2020

Amr Abdelrazik

Product Manager, Anthos

We are excited to announce a joint solution with NVIDIA, now publicly available in beta, that lets customers run NVIDIA GPU workloads on Anthos across hybrid cloud and on-premises environments.

Running GPU workloads between clouds

Machine learning is one of the fastest-growing application segments in the market today, powering industries such as biotech, retail, and manufacturing.

With such unprecedented growth, customers are facing multiple challenges. The first is the difficult choice of where to run your ML and HPC workloads. While the cloud offers elasticity and flexibility for ML workloads, some applications have latency, data size, or even regulatory requirements that mean they need to reside within certain data centers and at edge locations.

The other challenge is high demand for on-prem GPU resources. No matter how fast organizations onboard GPU hardware, demand is always greater than supply, so you always need to get the most out of your investment in GPUs.

Organizations are also looking for a hybrid architecture that makes the most of both cloud and on-prem resources. In this architecture, bursty and transient model development and training can run in the cloud, while inference and steady-state runtime can be on-prem, or vice versa.

Anthos and ML workloads

Anthos was built to enable customers to easily run applications both in the cloud and on-prem. Built on Kubernetes, Anthos’ advanced cluster management and multi-tenancy capabilities allow you to share your ML infrastructure across teams, increasing utilization and reducing the overhead of managing bespoke environments.

Anthos also allows you to run applications anywhere, whether they reside on-prem, on other cloud providers, or even at the edge. The flexibility of deployment options with Anthos, combined with open-source ML frameworks such as TensorFlow and Kubeflow, lets you build truly cloud-portable ML solutions and applications.

In addition to in-house developed applications, you can use Anthos to deploy Google Cloud’s best-in-class ML services such as Vision AI, Document AI, and many others in your data center and at edge locations, turbocharging ML initiatives in your organization.

Our collaboration with NVIDIA

For this solution, we’ve built on our strong relationship with NVIDIA, a leader in AI/ML acceleration. The solution uses the NVIDIA GPU Operator to deploy GPU drivers and software components required to enable GPUs in Kubernetes. The solution works with many popular NVIDIA data center GPUs such as the V100 and T4. This broad support allows you to take advantage of your existing and future investments in NVIDIA GPUs with Anthos. For more information about supported NVIDIA platforms, please check the NVIDIA GPU Operator documentation. You can also learn more about other Google Cloud and NVIDIA collaborations.
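To make this concrete, here is a minimal sketch of what requesting a GPU looks like once GPUs are enabled in a cluster. It builds a standard Kubernetes Pod manifest that asks for the `nvidia.com/gpu` extended resource, which is the resource name the NVIDIA device plugin advertises to Kubernetes. The helper name `gpu_pod_manifest`, the pod name, and the container image are hypothetical placeholders for illustration and are not taken from the Anthos or NVIDIA documentation.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Kubernetes Pod manifest that requests `gpus` NVIDIA GPUs.

    Hypothetical helper for illustration only; it is not part of any
    Anthos or NVIDIA tooling.
    """
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Assumption: nvidia-smi is available inside the container.
                    "command": ["nvidia-smi"],
                    # GPUs are requested as an extended resource under limits.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder image name; substitute a CUDA-capable image you trust.
    manifest = gpu_pod_manifest(
        "gpu-smoke-test", "registry.example.com/cuda-sample:latest"
    )
    print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest could be piped to `kubectl apply -f -` against a cluster where GPU support is already set up. The scheduler will then only place the pod on a node that advertises a free `nvidia.com/gpu` resource.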

Getting started

This solution is available in beta and works with Anthos on-prem 1.4 or later. For instructions on getting started with NVIDIA GPUs on Google Cloud’s Anthos, and for the list of supported NVIDIA GPUs, please refer to the documentation.
