Cost and performance optimizations for the E2 machine series


E2 VMs provide the value, performance, and flexibility to improve the vast majority of your workloads. E2 VMs use dynamic resource management to drive cost efficiency by making better use of the physical resources available on host machines. Additionally, E2 VMs use a custom-built CPU scheduler and performance-aware live migration to prioritize performance and protect your workloads from the type of issues associated with over-subscription. These are the same technologies that Google Search, Google Ads, Google Maps, and YouTube services use to run their latency-sensitive workloads efficiently.

How dynamic resource management works

Virtual CPUs (vCPUs) are implemented as threads that are scheduled to run on demand like any other thread on a host. When the vCPU has work to do, the work is assigned to an available physical CPU on which to run until it goes to sleep again. Similarly, virtual RAM is mapped to physical host pages via page tables that are populated when a guest-physical page is first accessed. This mapping remains fixed until the VM indicates that a guest-physical page is no longer needed.

Dynamic resource management enables Compute Engine to better utilize the available physical CPUs by scheduling VMs to servers based on resource demand and scheduling vCPU threads to physical CPUs such that wait time is minimized. In most cases, we are able to do this seamlessly. As a result, Google Cloud can run more VMs on fewer servers, allowing Compute Engine to offer E2 VMs at a significantly lower price than other VM types.

Components of dynamic resource management

Compute Engine uses the following technologies to implement dynamic resource management:

Larger, more efficient physical servers

Core count and RAM density have steadily increased to the point that host servers now have far more resources than any individual E2 VM. Google continually benchmarks new hardware and looks for platforms that are cost-effective and perform well for the widest variety of cloud workloads and services. When hardware upgrades become available, Compute Engine live-migrates E2 VMs to newer and faster hardware, allowing you to automatically take advantage of these new resources.

Intelligent VM placement

Google's cluster management system observes the CPU, RAM, memory bandwidth, and other resource demands of VMs running on a physical server. It uses this information to predict how a newly added VM will perform on that server. It then searches across thousands of servers to find the best location to add a VM. These observations ensure that when a new VM is placed, it is compatible with its neighbors and unlikely to experience interference from those instances.

Performance-aware live migration

After VMs are placed on a host, Compute Engine continuously monitors VM performance and wait times. If the resource demands of the VMs increase, Compute Engine can use live migration to transparently shift E2 loads to other hosts in the data center. The live migration policy is guided by a predictive approach that gives Compute Engine time to shift the load, often before any wait time is experienced by the VMs.

Hypervisor CPU scheduler

The hypervisor CPU scheduler dynamically maps E2 virtual CPU and memory to the physical CPU and memory of the host server on demand. This dynamic management drives cost efficiency in E2 VMs by making better use of the physical resources. Efficient use of resources means Compute Engine can run more VMs on fewer servers, allowing Google Cloud to offer E2 VMs for significantly less than other VM types.

Virtio memory balloon device

Memory ballooning is an interface mechanism between host and guest to dynamically adjust the size of the reserved memory for the guest. A virtio memory balloon device is used to implement memory ballooning. Through the virtio memory balloon device, a host can explicitly ask a guest to yield a certain amount of free memory pages (also called memory balloon inflation), and reclaim the memory so that the host can use the free memory for other VMs. Likewise, the virtio memory balloon device can return memory pages back to the guest by deflating the memory balloon.

Virtio memory device

Compute Engine E2 VM instances that are based on a public image have a virtio memory balloon device, which monitors the guest operating system's memory use. The guest operating system communicates its available memory to the host system. The host reallocates any unused memory to other processes on demand, thereby using memory more effectively. Compute Engine collects and uses this data to make more accurate rightsizing recommendations.

Verifying the driver installation

To check whether your image has the virtio memory balloon device driver installed and loaded, run the command for your operating system.

Linux

Most Linux distributions include the virtio memory balloon device driver. To verify that your image has the driver installed and loaded, run:

sudo modinfo virtio_balloon > /dev/null \
  && echo "Balloon driver is installed" \
  || echo "Balloon driver is not installed"
sudo lsmod | grep virtio_balloon > /dev/null \
  && echo "Balloon driver is loaded" \
  || echo "Balloon driver is not loaded"

In Linux kernels before 5.2, the Linux memory system sometimes mistakenly prevents large allocations when the balloon device is present. This is rarely an issue in practice, but we recommend changing the virtual memory overcommit_memory setting to 1 to prevent the issue from occurring. This change is already made by default in all Google-provided images published since February 9, 2021.
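Before changing anything, it can help to confirm whether a VM is affected at all. The following is a minimal sketch, not an official tool: the helper name version_lt is hypothetical, and the script assumes GNU sort (for the -V version-sort option). It compares the running kernel's major.minor version against 5.2 and reads the current overcommit policy.

```shell
#!/usr/bin/env bash
# version_lt A B: succeeds when version A sorts strictly before version B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Keep only major.minor, for example "4.19.0-18-cloud-amd64" -> "4.19".
kernel="$(uname -r | cut -d. -f1,2)"

# Current policy: 0 = heuristic, 1 = always overcommit, 2 = strict accounting.
current="$(cat /proc/sys/vm/overcommit_memory 2>/dev/null || echo unknown)"

if version_lt "$kernel" "5.2" && [ "$current" != "1" ]; then
  echo "Kernel $kernel predates 5.2 and vm.overcommit_memory is $current; consider setting it to 1."
else
  echo "No change suggested (kernel $kernel, vm.overcommit_memory=$current)."
fi
```

The script only reports; it does not modify any setting.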

To fix the setting, use the following command to change the value from 0 to 1:

sudo /sbin/sysctl -w vm.overcommit_memory=1

To persist this change across reboots, add the following to your /etc/sysctl.conf file:

vm.overcommit_memory=1
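If you apply this from a script that might run more than once, you may want to avoid appending the same line repeatedly. The following sketch shows one way to do that; the function name persist_sysctl and its optional file argument are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# persist_sysctl LINE [FILE]: append LINE to FILE unless an identical line
# is already there. FILE defaults to /etc/sysctl.conf; the second argument
# is an illustrative convenience so the function can be tried on a scratch file.
persist_sysctl() {
  local line=$1 conf=${2:-/etc/sysctl.conf}
  if grep -qxF -- "$line" "$conf" 2>/dev/null; then
    echo "already present in $conf"
  else
    printf '%s\n' "$line" >> "$conf" && echo "added to $conf"
  fi
}

# Example call (run as root so that the default file is writable):
#   persist_sysctl "vm.overcommit_memory=1"
```

The check matches only an exact, whole-line duplicate, so a differently formatted entry for the same key is left untouched and the new line is appended after it.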

Windows

Windows images provided by Compute Engine include the virtio memory balloon device driver. However, custom Windows images do not. To verify whether your Windows image has the driver installed, run:

googet verify google-compute-engine-driver-balloon

Disabling the virtio memory balloon device

Using the virtio memory balloon device enables Compute Engine to utilize memory resources more effectively so Google Cloud can offer E2 VMs at lower prices. You can opt out of the virtio memory balloon device by disabling the device driver. After disabling the virtio memory balloon device, you will continue to receive rightsizing recommendations; however, they might not be as accurate.

Linux

To disable the device in Linux, run the following command:

sudo rmmod virtio_balloon

You can add this command to the VM's startup script to automatically disable the device upon VM boot.
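As an illustration, a startup script could unload the module only when it is actually loaded, so that the script does not report an error on images without the driver. This is a sketch under assumptions: the function name balloon_loaded and its optional file argument are hypothetical, and the check reads /proc/modules, the same data that lsmod formats. Startup scripts run as root, so sudo is omitted.

```shell
#!/usr/bin/env bash
# balloon_loaded [FILE]: succeeds when virtio_balloon is listed as loaded.
# FILE defaults to /proc/modules; the argument is illustrative only, so the
# check can be exercised against a sample file.
balloon_loaded() {
  local modules_file=${1:-/proc/modules}
  grep -q '^virtio_balloon ' "$modules_file" 2>/dev/null
}

if balloon_loaded; then
  # Unload the driver; report failure instead of silently continuing.
  if rmmod virtio_balloon; then
    echo "virtio_balloon unloaded"
  else
    echo "failed to unload virtio_balloon" >&2
  fi
else
  echo "virtio_balloon not loaded; nothing to do"
fi
```

Assuming the script is saved locally as disable-balloon.sh (a hypothetical file name), one way to attach it to an existing VM is gcloud compute instances add-metadata VM_NAME --metadata-from-file=startup-script=disable-balloon.sh, where VM_NAME is a placeholder for your instance name.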

Windows

To disable the device on Windows, run the following command:

googet -noconfirm remove google-compute-engine-driver-balloon

You can put this command into the VM's startup script to automatically disable the device upon VM boot.
