Exafunction
Efficient deep learning at scale
Exafunction is a GPU acceleration platform that lets users register arbitrary GPU computations and deep learning models and run them at any scale, both in the cloud and on premises. In many deep learning workloads, GPUs are underutilized and sit idle because of CPU computation or I/O bottlenecks. Exafunction manages clusters of GPUs and lets applications run entirely on CPU, offloading their GPU computations and colocating them to keep the GPUs fully utilized. As a result, workloads run much faster on the same number of GPUs, which matters especially now that GPUs are both scarce and expensive. The system has been deployed at the largest autonomous vehicle companies for their simulation workloads, where it runs concurrently across thousands of GPUs, reducing GPU usage and accelerating workloads by over 5x.