Compute

Increase Compute Engine VM performance with custom queues

November 17, 2023

Matthew Dalida

Technical Account Manager, Google Cloud

Aniruddha Agharkar

Google Cloud Network Specialist

Leveraging Custom Queues per vNIC on Compute Engine

Whether you're running high-bandwidth applications, traffic-heavy AI/ML workloads, or network virtual appliances (NVAs), ensuring your VMs can handle the load is essential. However, with the default queue allocation, your VMs may not be reaching their full potential and meeting current demands. In a previous blog post, we shared how to increase Compute Engine VM bandwidth with Tier 1 networking on Google Cloud. Today we look at a new feature to maximize the network performance on Compute Engine VMs by assigning custom queues per virtual network interface card (vNIC).

The Challenge: Limited Queue Allocation

In a bustling city with high traffic demands, adding lanes to a busy road can help alleviate traffic congestion. In a similar way, queues on a vNIC can get congested. Previously, each vNIC on a Compute Engine VM was assigned a fixed number of network queues based on the default queue allocation. This meant that you were limited in the number of queues each vNIC could utilize, which could hinder performance for demanding workloads.

For instance, an n2-standard-128 VM with 8 vNICs would only yield 8 queues per vNIC (Tx/Rx), utilizing only half the maximum number of queues available on the VM. This forced you to make trade-offs, either vertically scaling up to a larger VM or reducing the number of vNICs to ensure each vNIC had sufficient queues. Both options led to suboptimal scaling and increased costs.

The Solution: Custom Queues per vNIC

Now, you can leverage custom queues per vNIC. This feature allows you to manually assign each vNIC up to the maximum number of network queues supported by the driver type. With virtIO, the maximum queue count is 32, while with gVNIC, it's 16.

Who Should Care?

This feature is particularly beneficial for those running:

High-bandwidth applications: If your applications demand high throughput, custom queues can significantly improve performance by allowing each vNIC to handle more traffic.
Traffic-heavy AI/ML workloads: AI/ML workloads often generate massive amounts of data, and custom queues can ensure your VMs can efficiently process and transfer this data without bottlenecks.
Network virtual appliances (NVAs): NVAs generally have multiple network interfaces, and custom queues can ensure each interface has the resources it needs to perform optimally.

Benefits of Custom Queues

Maximize network performance: By utilizing all available network queues, you can significantly improve the overall performance of your VMs.
Optimize resource utilization: Custom queues allow you to tailor your VM's network resources to your specific needs, ensuring efficient utilization and avoiding unnecessary costs.
Scale seamlessly: With custom queues, you can scale your VMs more effectively without compromising network performance.
Cost-effective solution: Custom queues are included with Tier 1 high-bandwidth networking and Gen 2 GCE VMs, so there's no additional cost to take advantage of this powerful feature.

Custom Queues and Oversubscription

As mentioned earlier, with custom queue allocation for gVNIC, you could allocate up to 16 queues per vNIC. The total number of queues on all of the vNICs cannot exceed the number of vCPUs on the VM.
However, if you are using N2, N2D, C2, or C2D VMs with gVNIC driver, and Tier_1 networking enabled, you can use queue oversubscription. This means that you can assign up to 16 network queues per vNIC such that the total number of queues on all of the vNICs can exceed the number of vCPUs.
Effectively with queue oversubscription, the total number of queues per VM is 16 times the number of vNICs.
We set up our benchmarking tests using a multi-nic VM configured as an NVA connecting 8 different VPC networks with a goal to measure the aggregate throughput delivered by the inline NVA.
Using the default queue allocation we were able to push an aggregate throughput of ~50% of the NVA VM’s stated throughput whereas with queue oversubscription this went up close to saturating ~99% of the NVA VM’s stated throughput. Of course throughput results will vary, so try it out yourself to see what is possible in your network.

Start using Custom Queues Today

Custom queues per virtual network interface card on Compute Engine VMs is a significant advancement that addresses the network performance needs of modern workloads. By enabling custom queue allocation, you can achieve significant performance improvements without incurring additional costs. This feature is particularly valuable for high-bandwidth applications, traffic-heavy AI/ML workloads, and organizations seeking to maximize network efficiency and cost savings. Get started today using gcloud, Terraform, or REST API.

Compute

Configuring your Compute Engine VMs with Tier_1 high-bandwidth networking

By Matthew Dalida • 2-minute read

https://storage.googleapis.com/gweb-cloudblog-publish/images/05_-_Compute.max-900x900.jpg

Posted in

Compute

Taming the stragglers: Maximize AI training performance with automated straggler detection

By Elias Ahmed • 5-minute read

Compute

Google is a Leader in the Gartner® Magic Quadrant for Strategic Cloud Platform Services

By Brad Calder • 6-minute read

AI & Machine Learning

Announcements for AI Hypercomputer: The latest infrastructure news for ML practitioners

By Chelsie Czop • 6-minute read

Compute

Now GA: C4 VMs with Local SSD, bare metal, and larger shapes, on Intel Xeon 6

By Olivia Melendez • 4-minute read

Increase Compute Engine VM performance with custom queues

Matthew Dalida

Aniruddha Agharkar

Configuring your Compute Engine VMs with Tier_1 high-bandwidth networking

Related articles

Taming the stragglers: Maximize AI training performance with automated straggler detection

Google is a Leader in the Gartner® Magic Quadrant for Strategic Cloud Platform Services

Announcements for AI Hypercomputer: The latest infrastructure news for ML practitioners

Now GA: C4 VMs with Local SSD, bare metal, and larger shapes, on Intel Xeon 6