Introducing PyTorch across Google Cloud
Director of Product Management, Cloud AI
In conjunction with today’s release of PyTorch 1.0 Preview, we are broadening support for PyTorch throughout Google Cloud’s AI platforms and services. PyTorch is a deep learning framework designed for easy and flexible experimentation. With the release of PyTorch 1.0 Preview, the framework now supports a fully hybrid Python and C/C++ front-end as well as fast, native distributed execution for production environments.
Here at Google Cloud, we aim to support the full spectrum of machine learning (ML) practitioners, ranging from students and entrepreneurs who are just getting started to the world’s top research and engineering teams. ML developers use many different tools, and we’ve integrated several of the most popular open source frameworks into our products and services, including TensorFlow, PyTorch, scikit-learn, and XGBoost.
Deep Learning VM Images
Google Cloud Platform provides a set of virtual machine (VM) images that include everything you need to get started with various deep learning frameworks. We have provided a community-focused PyTorch VM image for a while, but today, we are especially excited to share a new VM image that contains PyTorch 1.0 Preview. This is the fastest way for you to try out the latest PyTorch release easily and efficiently: we’ve set up NVIDIA drivers and even pre-installed Jupyter Lab with sample PyTorch tutorials.
Kubeflow
Kubeflow is an open source platform designed to make end-to-end ML pipelines easy to deploy and manage. Kubeflow already supports PyTorch, and the Kubeflow community has developed a PyTorch package that can be installed in a Kubeflow deployment with just two commands. Additionally, in collaboration with NVIDIA, we have extended the TensorRT package in Kubeflow to support serving PyTorch models. We aim for Kubeflow to be the easiest way to build portable, scalable, and composable PyTorch pipelines that run everywhere.
TensorBoard integration
We’ve heard repeatedly from PyTorch users that they would appreciate a deeper integration with TensorBoard, a popular suite of machine learning visualization tools. We think this is a great idea, and the TensorBoard and PyTorch developers are now collaborating to make it simpler to use TensorBoard to monitor PyTorch training.
PyTorch on Cloud TPUs
Much of the tremendous progress in machine learning over the past several years has been driven by dramatic increases in the amount of computing power that can be harnessed to train and run ML models. This sea change has motivated Google to develop three generations of custom ASICs called “Tensor Processing Units,” or TPUs, that are specialized for machine learning. We’ve brought the second and third generations of these chips to Google Cloud as Cloud TPUs, and many PyTorch users have expressed interest in accelerating their ML workloads with Cloud TPUs.
Today, we’re pleased to announce that engineers on Google’s TPU team are actively collaborating with core PyTorch developers to connect PyTorch to Cloud TPUs. The long-term goal is to enable everyone to enjoy the simplicity and flexibility of PyTorch while benefiting from the performance, scalability, and cost-efficiency of Cloud TPUs.
As a starting point, the engineers involved have produced a prototype that connects PyTorch to Cloud TPUs via XLA, an open source linear algebra compiler. This prototype has successfully enabled us to train a PyTorch implementation of ResNet-50 on a Cloud TPU, and we’re planning to open source the prototype and then expand it in collaboration with the PyTorch community. Please email us at email@example.com to tell us what types of PyTorch workloads you would be most interested in accelerating with Cloud TPUs!
1.0 is just the beginning
We’d like to congratulate the PyTorch team on reaching their 1.0 milestone and officially welcome the PyTorch community to Google Cloud Platform. As you explore all of the offerings above, we’d love to hear your feedback and feature requests. You can contact us @GCPcloud.