Cloud Machine Learning Engine is one component of a larger machine learning solution, and it requires a development environment with carefully configured prerequisites and dependencies. This page describes the pieces that make up your development environment and the issues to consider for each.
Python version support
Cloud ML Engine runs Python 2.7 by default.
Python 3.5 is available with Cloud ML Engine runtime version 1.4 and later. You can set the Python version for your training job in a configuration file or with gcloud commands.
Online and batch prediction work with trained models, regardless of whether they were trained using Python 2 or Python 3.
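As a sketch, the Python version can be set in the training job's configuration file, which is then passed to `gcloud ml-engine jobs submit training` with the `--config` flag (the runtime version shown here is illustrative):

```yaml
# config.yaml
trainingInput:
  runtimeVersion: "1.4"   # runtime versions 1.4 and later support Python 3.5
  pythonVersion: "3.5"    # defaults to "2.7" if omitted
```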
If you are configuring your base development environment, you may need to use sudo to run pip installations on macOS or Linux. If you use a virtual environment instead, you won't need root access, because packages are installed outside of OS-protected system directories.
The configuration of the virtual machines that run your project's jobs in the cloud is defined by the Cloud ML Engine runtime version that you use.
Python virtual environments
Python configuration can be complicated, especially if you develop other Python applications using different technologies on the same computer. You can simplify your package and version management by using a virtual environment to do your Python development.
A Python virtual environment manages a Python interpreter and packages that are isolated from your computer's default environment and dedicated to your project. You can use virtual environments to configure separate environments for each Python project you work on, each with its own version of Python and the modules you need.
There are several options for Python virtual environments. We recommend Anaconda (or its lightweight counterpart, Miniconda); both include their own virtual environment manager, Conda. Anaconda is a popular suite of packages and tools that is widely used by data scientists.
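As a minimal sketch of the workflow (the environment name is arbitrary; `python3 -m venv` is shown, but virtualenv or `conda create` work analogously):

```shell
# Create an isolated environment in the current directory (name is arbitrary)
python3 -m venv cmle-env

# Activate it; pip now installs into cmle-env rather than
# OS-protected system directories, so no sudo is required
. cmle-env/bin/activate

# Packages installed here stay isolated to this environment
pip --version

# Leave the environment when finished
deactivate
```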
Machine learning frameworks
Cloud ML Engine supports the following frameworks:
- TensorFlow for training, online prediction, and batch prediction. See the guide to getting started with TensorFlow on Cloud ML Engine.
- scikit-learn for online prediction. See the quickstart for scikit-learn and XGBoost on Cloud ML Engine.
- XGBoost for online prediction. See the quickstart for scikit-learn and XGBoost on Cloud ML Engine.
Google Cloud Platform account
You must have a GCP account with billing enabled and a project with Cloud Machine Learning Engine API enabled to use any of the cloud functionality of Cloud ML Engine. If you are new to GCP, read the overview of projects for more information.
Cloud Compute regions
Processing resources are allocated by region and zone, which correspond to the data centers where the resources are physically located. You should typically run your one-off jobs, like model training, in the region closest to your physical location (or the physical location of your intended users), but note the following points:
- Note the available regions for Cloud ML Engine services, including model training on GPUs and other hardware, and online/batch prediction.
- Always run your Cloud ML Engine jobs in the same region as the Google Cloud Storage bucket that you're using to read and write data for the job.
- Use the regional storage class for any Google Cloud Storage buckets that you're using to read and write data for your Cloud ML Engine job.
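As an illustrative sketch (the bucket name, job name, region, and trainer package below are all placeholders, and running these commands requires a configured GCP project), creating a regional bucket and submitting a training job to the same region might look like:

```shell
# Illustrative only: bucket, region, and job details are placeholders
REGION="us-central1"
BUCKET="gs://my-ml-staging-bucket"

# Create a bucket with the regional storage class in the chosen region
gsutil mb -l "$REGION" -c regional "$BUCKET"

# Submit the training job to the same region as the bucket
gcloud ml-engine jobs submit training my_training_job \
    --region "$REGION" \
    --staging-bucket "$BUCKET" \
    --runtime-version 1.4 \
    --module-name trainer.task \
    --package-path trainer/
```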
What's next
- Work through the getting-started guide to see Cloud ML Engine in action.