N4 VMs, powered by 5th generation Intel Xeon processors and Titanium, use next generation dynamic resource management to drive cost efficiency by making better use of the physical resources available on host machines. They also use a custom-built CPU scheduler and performance-aware live migration to balance workload performance needs with available resources. These are the same technologies that Google Search, Google Ads, Google Maps, and YouTube services use to run their latency-sensitive workloads efficiently.
Next generation dynamic resource management also has better NUMA affinity, more accurate prediction of resource requirements, and faster rebalancing using performance-aware live migration.
How dynamic resource management works
Virtual CPUs (vCPUs) are implemented as threads that are scheduled to run on demand like any other thread on a host. When the vCPU has work to do, the work is assigned to an available physical CPU on which to run until it goes to sleep again. Similarly, virtual RAM is mapped to physical host pages using page tables that are populated when a guest-physical page is first accessed. This mapping remains fixed until the VM indicates that a guest-physical page is no longer needed.
Dynamic resource management enables Compute Engine to better use the available physical CPUs by scheduling VMs to servers based on resource demand, and by scheduling vCPU threads to physical CPUs such that wait time is minimized. In most cases, Compute Engine can do this seamlessly, so Google Cloud can run VMs more efficiently on fewer servers.
Components of dynamic resource management
Compute Engine uses the following technologies for dynamic resource management:
Larger, more efficient physical servers
Core count and RAM density have steadily increased such that now the host servers have far more resources than any individual VM. Google continually benchmarks new hardware and looks for platforms that are cost-effective and perform well for the widest variety of cloud workloads and services, allowing you to take advantage of the newest technologies when they're available.
Intelligent VM placement
Google's cluster management system observes the CPU, RAM, memory bandwidth, and other resource demands of VMs running on a physical server. It uses this information to predict how a newly added VM will perform on that server. It then searches across thousands of servers to find the best location to add a VM. These observations ensure that when a new VM is placed, it is compatible with its neighbors and unlikely to experience interference from those instances.
Performance-aware live migration
After VMs are placed on a host, Compute Engine continuously monitors VM performance and wait times. If the resource demands of the VMs increase, Compute Engine can use live migration to transparently shift workloads to other hosts in the data center. The live migration policy is guided by a predictive approach that gives Compute Engine time to shift the load, often before any wait time is experienced by the VMs.
Hypervisor CPU scheduler
The hypervisor CPU scheduler dynamically maps virtual CPU and memory to the physical CPU and memory of the host server on demand. This dynamic management drives cost efficiency in VMs by making better use of the physical resources. Efficient use of resources means Compute Engine can run VMs more efficiently on fewer servers, allowing Google Cloud to pass on savings to users.
First generation dynamic resource management
E2 was the first VM series to offer dynamic resource management using a virtio memory balloon device.
Virtio memory balloon device with E2 VMs
Memory ballooning is an interface mechanism between host and guest that dynamically adjusts the amount of memory reserved for the guest. E2 uses a virtio memory balloon device to implement memory ballooning. Through the virtio memory balloon device, a host can explicitly ask a guest to yield a certain amount of free memory pages (also called memory balloon inflation) and reclaim that memory so that the host can use it for other VMs. Likewise, the virtio memory balloon device can return memory pages to the guest by deflating the memory balloon. E2 is the only machine series that uses the memory balloon device.
Compute Engine E2 VM instances that are based on a public image have a virtio memory balloon device, which monitors the guest operating system's memory use. The guest operating system communicates its available memory to the host system. The host reallocates any unused memory to other processes on demand, thereby using memory more effectively. Compute Engine collects and uses this data to make more accurate rightsizing recommendations.
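To get a feel for the numbers involved, you can look at the guest's own view of its memory. The following is a minimal sketch, assuming a Linux guest; `mem_summary` is a hypothetical helper name, and the source does not state which statistics the balloon driver reports to the host, so treat the `MemAvailable` figure as an approximation of what the guest considers free.

```shell
#!/bin/sh
# mem_summary [FILE]: print total and available memory in MiB.
# Hypothetical helper. FILE defaults to /proc/meminfo; the optional
# argument only exists so the function can read a saved copy.
mem_summary() {
    awk '
        $1 == "MemTotal:"     { total = $2 }   # /proc/meminfo values are in kB
        $1 == "MemAvailable:" { avail = $2 }
        END {
            printf "total_mib=%d available_mib=%d\n", total / 1024, avail / 1024
        }
    ' "${1:-/proc/meminfo}"
}

# Only query the live system when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```

A large gap between total and available memory over long periods suggests memory that the host could reclaim through balloon inflation.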
Verifying the driver installation
To check whether your image has the virtio memory balloon device driver installed and loaded, follow the steps for your operating system.
Linux
Most Linux distributions include the virtio memory balloon device driver. To verify that your image has the driver installed and loaded, run:
sudo modinfo virtio_balloon > /dev/null && echo "Balloon driver is installed" || echo "Balloon driver is not installed"
sudo lsmod | grep virtio_balloon > /dev/null && echo "Balloon driver is loaded" || echo "Balloon driver is not loaded"
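If you want to run the "loaded" check from a script, an exact match on the module name is less error-prone than a substring `grep`. The following is a minimal sketch, assuming a Linux guest; `module_loaded` is a hypothetical helper name. It reads `/proc/modules`, the same listing that `lsmod` formats, so a driver compiled directly into the kernel does not appear in it.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if NAME is listed as a loaded module.
# Hypothetical helper. FILE defaults to /proc/modules; the optional
# argument only exists so the function can read a saved listing.
module_loaded() {
    awk -v name="$1" '
        $1 == name { found = 1 }     # first column is the module name
        END { exit !found }          # exit status 0 only when found
    ' "${2:-/proc/modules}"
}

if [ -r /proc/modules ]; then
    if module_loaded virtio_balloon; then
        echo "Balloon driver is loaded"
    else
        echo "Balloon driver is not loaded"
    fi
fi
```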
In Linux kernels before 5.2, the Linux memory system sometimes mistakenly prevents large allocations when the balloon device is present. This is rarely an issue in practice, but we recommend changing the virtual memory overcommit_memory setting to 1 to prevent the issue from occurring. This change is already made by default in all Google-provided images published since February 9, 2021.
To fix the setting, use the following command to change the value from 0 to 1:
sudo /sbin/sysctl -w vm.overcommit_memory=1
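Before applying the command above, you might want to confirm that the VM runs an affected kernel and see the current value. The following is a minimal sketch; `kernel_before_5_2` is a hypothetical helper name, and it compares only the major and minor numbers of the release string reported by `uname -r`.

```shell
#!/bin/sh
# kernel_before_5_2 RELEASE: succeed if RELEASE (for example
# "4.19.0-18-cloud-amd64") is older than 5.2. Hypothetical helper.
kernel_before_5_2() {
    major=${1%%.*}             # text before the first dot
    rest=${1#*.}               # text after the first dot
    minor=${rest%%[!0-9]*}     # leading digits of the remainder
    [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 2 ]; }
}

if kernel_before_5_2 "$(uname -r)"; then
    echo "Kernel $(uname -r) predates 5.2; vm.overcommit_memory=1 is recommended"
fi

# Show the current value where the kernel exposes it.
if [ -r /proc/sys/vm/overcommit_memory ]; then
    echo "vm.overcommit_memory is $(cat /proc/sys/vm/overcommit_memory)"
fi
```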
To persist this change across reboots, add the following to your /etc/sysctl.conf file:
vm.overcommit_memory=1
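If you apply this from a provisioning script, it helps to avoid appending the same line on every run. The following is a minimal sketch; `persist_sysctl` is a hypothetical helper name, and the pattern match is deliberately loose (the dots in the key match any character), which is acceptable for a quick check but not a strict parser.

```shell
#!/bin/sh
# persist_sysctl KEY VALUE FILE: append "KEY=VALUE" to FILE unless an
# uncommented line for KEY already exists there. Hypothetical helper.
persist_sysctl() {
    if grep -Eq "^[[:space:]]*$1[[:space:]]*=" "$3" 2>/dev/null; then
        # Leave existing entries alone so a conflicting value is not hidden.
        echo "$1 is already set in $3; review it manually"
    else
        printf '%s=%s\n' "$1" "$2" >> "$3"
    fi
}

# Example invocation (needs root, so it is left commented out here):
# persist_sysctl vm.overcommit_memory 1 /etc/sysctl.conf
```

After editing the file, `sudo sysctl -p` reloads the settings from /etc/sysctl.conf without a reboot.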
Windows
Compute Engine Windows images include the virtio memory balloon device driver. However, custom Windows images don't. To verify whether your Windows image has the driver installed, run:
googet verify google-compute-engine-driver-balloon
Disabling the virtio memory balloon device
Using the virtio memory balloon device enables Compute Engine to utilize memory resources more effectively so Google Cloud can offer E2 VMs at lower prices. You can opt out of the virtio memory balloon device by disabling the device driver. After disabling the virtio memory balloon device, you will continue to receive rightsizing recommendations; however, they might not be as accurate.
Linux
To disable the device in Linux, run the following command:
sudo rmmod virtio_balloon
You can add this command to the VM's startup script to automatically disable the device upon VM boot.
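In a startup script, it is safer to unload the module only when it is actually loaded, so that the script does not report an error on images without the driver. The following is a minimal sketch of such a fragment; `unload_balloon` is a hypothetical helper name, and the sketch assumes the script runs with root privileges, so it does not use `sudo`.

```shell
#!/bin/sh
# unload_balloon [FILE]: remove the virtio_balloon module if it is loaded.
# Hypothetical helper. FILE defaults to /proc/modules; the optional
# argument only exists so the check can be exercised against a saved listing.
unload_balloon() {
    if grep -q '^virtio_balloon ' "${1:-/proc/modules}" 2>/dev/null; then
        echo "unloading virtio_balloon"
        rmmod virtio_balloon
    else
        echo "virtio_balloon not loaded; nothing to do"
    fi
}

unload_balloon
```

One way to attach the fragment to an existing VM is `gcloud compute instances add-metadata VM_NAME --metadata-from-file=startup-script=FILE`, where VM_NAME and FILE are placeholders for your instance name and the local path of the script.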
Windows
To disable the device on Windows, run the following command:
googet -noconfirm remove google-compute-engine-driver-balloon
You can put this command into the VM's startup script to automatically disable the device upon VM boot.
What's next
- Read the blog about Dynamic resource management.
- Review the N4 machine series information.
- Review the E2 machine series information.
- Create a VM.