Intel Performance Libraries and Python Distribution enhance performance and scaling of Intel® Xeon® Scalable (‘Skylake’) processors on GCP
Google was pleased to be the first cloud vendor to offer the latest-generation Intel® Xeon® Scalable (‘Skylake’) processors in February 2017. With their higher core counts, improved on-chip interconnect with the new Intel® Mesh Architecture, enhanced memory subsystems and Intel® Advanced Vector Extensions-512 (AVX-512) functional units, these processors are a great fit for demanding HPC applications that need high floating-point operation rates (FLOPS) and the operand bandwidth to feed the processing pipelines.
Skylake raises the performance bar significantly, but a processor is only as powerful as the software that runs on it. So today we're announcing that the Intel Performance Libraries are now freely available for Google Cloud Platform (GCP) Compute Engine. These libraries, which include the Intel® Math Kernel Library, Intel® Data Analytics Acceleration Library, Intel® Performance Primitives, Intel® Threading Building Blocks, and Intel® MPI Library, integrate key communication and computation kernels that have been tuned and optimized for this latest Intel processor family, in terms of both sequential pipeline flow and parallel execution. These components are useful across all the Intel Xeon processor families in GCP, but they're of particular interest for applications that can use them to fully exploit the scale of 96 vCPU instances on Skylake-based servers.
Scaling out to Skylake can result in dramatic performance improvements. This parallel SGEMM matrix multiplication benchmark result, run by Intel engineers on GCP, shows the advantage of moving from a 64 vCPU GCP instance on an Intel® Xeon® processor E5 (“Broadwell”) system to a 96 vCPU instance on Intel® Xeon® Scalable (“Skylake”) processors, using Intel® MKL on GCP. Using half or fewer of the available vCPUs reduces hyper-thread sharing of the AVX-512 functional units and leads to higher efficiency.
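To get a rough feel for SGEMM throughput on your own instance, a minimal sketch along the following lines may help. It assumes NumPy is installed and linked against a BLAS implementation such as Intel® MKL, in which case a float32 matrix product is typically dispatched to SGEMM. The function name `sgemm_gflops` and the matrix size are illustrative choices, not the configuration Intel used for the benchmark above.

```python
import time

import numpy as np


def sgemm_gflops(n=1024, repeats=3, seed=0):
    """Time an n x n single-precision matrix product; return (GFLOPS, result)."""
    rng = np.random.default_rng(seed)
    a = rng.random((n, n), dtype=np.float32)
    b = rng.random((n, n), dtype=np.float32)

    a @ b  # warm-up run so one-time library initialization is not timed

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        c = a @ b
        best = min(best, time.perf_counter() - start)

    # A dense n x n matrix product performs about 2 * n^3 floating-point operations.
    return 2.0 * n**3 / best / 1e9, c


if __name__ == "__main__":
    gflops, _ = sgemm_gflops()
    print(f"SGEMM throughput: {gflops:.1f} GFLOPS")
```

With an MKL-backed NumPy, setting the `MKL_NUM_THREADS` environment variable before starting Python limits the number of threads MKL uses, which is one way to try the "half or fewer of the available vCPUs" configuration described above.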
In addition to the pre-compiled performance libraries, GCP users now have free access to the Intel® Distribution for Python, a distribution of both Python 2 and Python 3 that is tuned to take full advantage of Intel instruction set features and processor pipelines.
The following chart shows example performance improvements delivered by the optimized scikit-learn K-means functions in the Intel® Distribution for Python over the stock open source Python distribution.
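A comparison of this kind can be approximated by running the same script under both Python distributions and comparing the elapsed times. The sketch below assumes NumPy and scikit-learn are installed. The helper name `time_kmeans`, the synthetic data and the problem sizes are hypothetical and do not reproduce the workload behind the chart.

```python
import time

import numpy as np
from sklearn.cluster import KMeans


def time_kmeans(n_samples=20000, n_features=20, n_clusters=8, seed=0):
    """Fit K-means on synthetic data; return (elapsed seconds, fitted model)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))

    # A single initialization keeps the timing focused on one K-means run.
    model = KMeans(n_clusters=n_clusters, n_init=1, random_state=seed)

    start = time.perf_counter()
    model.fit(X)
    elapsed = time.perf_counter() - start
    return elapsed, model


if __name__ == "__main__":
    elapsed, model = time_kmeans()
    print(f"K-means fit took {elapsed:.3f} s over {model.n_iter_} iterations")
```

Because the random seed is fixed, both distributions see the same data and the same starting centroids, so differences in elapsed time mostly reflect the underlying numerical libraries.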
We’re delighted that Google Cloud Platform users will experience the best of Intel® Xeon® Scalable processors using the Intel® Distribution for Python and the Intel performance libraries: Intel® MKL, Intel® DAAL, Intel® TBB, Intel® IPP and Intel® MPI. These software tools are carefully tuned to deliver the performance benefits of the advanced processors that Google has deployed, including instances with 96 vCPUs and the workload-optimized vector capabilities provided by Intel® AVX-512.