Development Environment Considerations

As part of a set of technologies that contribute to a machine learning solution, Cloud Machine Learning Engine requires a development environment with carefully configured prerequisites and dependencies. This page describes the pieces that make up your development environment and the issues that go with them. You can generally use the quickstarts to set up your environment unless you have specific needs that make it infeasible.

Operating systems

Cloud Machine Learning Engine operating system support depends on two factors: TensorFlow support and Python version.

TensorFlow currently supports macOS, Linux, and Windows. However, Windows is only supported for Python 3 development.

Cloud ML Engine runs Python 2.7.

You can develop your trainer with Python 3 on a Windows computer and then train it with Cloud ML Engine, but only if your code is compatible with both Python 2.7 and Python 3. Compatibility libraries such as six can help. The six library is included in the Cloud ML Engine runtime images by default.
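As a rough illustration of what "compatible with both versions" can mean in practice, the following sketch avoids two common Python 2/3 pitfalls: text versus byte strings, and integer division. It uses only the standard library so that it runs without extra packages. The names PY2 and text_type mirror the helpers of the same name in six (six.PY2 and six.text_type), while to_text and mean are hypothetical helpers invented for this example.

```python
import sys

# True when the interpreter is Python 2, the version that Cloud ML Engine runs
# according to this page.
PY2 = sys.version_info[0] == 2

# Choose the text type once so the rest of the code does not branch on version.
if PY2:
    text_type = unicode  # noqa: F821 (this name exists only on Python 2)
else:
    text_type = str


def to_text(value, encoding="utf-8"):
    """Return value as a text string on both Python 2 and Python 3."""
    if isinstance(value, bytes) and not isinstance(value, text_type):
        return value.decode(encoding)
    return text_type(value)


def mean(values):
    """Average that sidesteps Python 2 integer division by forcing float math."""
    values = list(values)
    return sum(values) / float(len(values))
```

If six is installed in your environment, you could import PY2 and text_type from it instead of defining them by hand.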

You could potentially use a Windows computer to start training jobs using a trainer developed in Python 2 on another machine. However, you would not be able to use the gcloud command-line tool to run local training. We do not recommend this approach.

Root access

If you are configuring your base development environment, you may need to use sudo to run your pip installation on macOS or Linux. However, if you use a virtual environment, you won't need root access, because installation happens outside of OS-protected system directories.
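If you are unsure whether the interpreter you are about to install into is isolated from the system Python, a best-effort check like the one below can help. This is a sketch, not an official diagnostic: the function name in_isolated_environment is hypothetical, and the Conda check also returns True for the root Conda installation, because that installation has a conda-meta directory too.

```python
import os
import sys


def in_isolated_environment():
    """Best-effort check for a virtualenv, venv, or Conda environment."""
    # Older versions of the virtualenv tool set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv make sys.prefix differ from sys.base_prefix.
    if getattr(sys, "base_prefix", sys.prefix) != sys.prefix:
        return True
    # Conda installations keep a conda-meta directory under sys.prefix.
    return os.path.isdir(os.path.join(sys.prefix, "conda-meta"))


if in_isolated_environment():
    print("Isolated environment detected; pip should not need sudo.")
else:
    print("System interpreter detected; pip may need root access.")
```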

Runtime environment

In addition to the operating system that you use for development, you should be aware of the operating environment that your Cloud ML Engine jobs and requests run in. The configuration of the virtual machines that run your Cloud ML Engine jobs and requests in the cloud is defined by the runtime version that you use.
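One place where the runtime version typically appears is the command that submits a training job. The sketch below only builds the argument list and prints it; it does not run anything. The job name, module name, package path, bucket path, and the runtime version value "1.4" are all placeholders chosen for illustration, and you should confirm the flags against the output of gcloud ml-engine jobs submit training --help for your installed SDK.

```python
def build_submit_command(job_name, region, runtime_version,
                         module_name, package_path, job_dir):
    """Return the argument list for submitting a training job (not executed)."""
    return [
        "gcloud", "ml-engine", "jobs", "submit", "training", job_name,
        "--region", region,
        "--runtime-version", runtime_version,
        "--module-name", module_name,
        "--package-path", package_path,
        "--job-dir", job_dir,
    ]


# Every value below is a hypothetical placeholder.
command = build_submit_command(
    job_name="my_training_job",
    region="us-east1",
    runtime_version="1.4",
    module_name="trainer.task",
    package_path="trainer/",
    job_dir="gs://my-bucket/my_training_job",
)
print(" ".join(command))
```

Building the command as a list keeps each argument separate, which makes it straightforward to pass to subprocess later without shell quoting problems.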

Python virtual environments

Python configuration can be complicated, especially if you develop other Python applications using different technologies on the same computer. You can simplify your package and version management by using a virtual environment to do your Python development.

A Python virtual environment manages a Python interpreter and packages that are isolated from your computer's default environment and dedicated to your project. You can use virtual environments to configure separate environments for each Python project you work on, each with its own version of Python and the modules you need.

There are several options for Python virtual environments. We recommend Anaconda, a popular suite of packages and tools that is commonly used by data scientists, or its smaller version, Miniconda. Both include their own virtual environment manager called Conda.

TensorFlow

Cloud ML Engine is built to run TensorFlow training applications and models in the cloud, so it should be no surprise that TensorFlow is a hard dependency. TensorFlow provides many options for installation. If you find the options overwhelming, we recommend using Anaconda to install into a virtual environment.

Google Cloud Platform account

Along with TensorFlow, Google Cloud Platform is the most vital piece for Cloud ML Engine development. You must have an account with billing enabled and a project with the Cloud Machine Learning Engine API enabled to use any of the cloud functionality of Cloud ML Engine. If you are new to Cloud Platform, read the overview of projects for more information.

Cloud Compute regions

Processing resources are allocated by region and zone, which correspond to the data centers where the resources are physically located. You should typically run your one-off jobs, like model training, in the region closest to your physical location (or the physical location of your intended users), subject to the following exceptions and constraints:

  • us-west1 is not an available region for Cloud ML Engine jobs. If your project usually uses us-west1, you'll need to specify a different region when you create a job.

  • You should always run your Cloud ML Engine jobs in the same region as the Google Cloud Storage bucket that you're using to read and write data for the job.

  • You must use Regional storage for any Google Cloud Storage bucket that you use with Cloud ML Engine jobs.

  • GPU support is currently only available in the following regions:

    • us-east1
    • asia-east1
    • europe-west1
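The region rules above can be sketched as a small pre-flight check that you run before creating a job. The region names come from the list above and reflect this page at the time of writing; the function name check_job_region and its parameters are hypothetical, and the check does not query any Google Cloud API.

```python
# Regions taken from the list above; they may change over time.
GPU_REGIONS = {"us-east1", "asia-east1", "europe-west1"}
UNAVAILABLE_REGIONS = {"us-west1"}


def check_job_region(job_region, bucket_region, needs_gpu=False):
    """Return a list of problems with a proposed job region (empty if none)."""
    job_region = job_region.lower()
    bucket_region = bucket_region.lower()
    problems = []
    if job_region in UNAVAILABLE_REGIONS:
        problems.append(job_region + " is not available for Cloud ML Engine jobs")
    if job_region != bucket_region:
        problems.append("job region and Cloud Storage bucket region differ")
    if needs_gpu and job_region not in GPU_REGIONS:
        problems.append(job_region + " does not offer GPU support")
    return problems


# Example: a GPU job whose bucket lives in a different region.
for problem in check_job_region("us-west1", "us-central1", needs_gpu=True):
    print(problem)
```

Returning a list rather than raising on the first failure lets you report every issue with a configuration at once.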
