Deploying Models

You can host your trained models in the cloud with Cloud ML Engine and use them to get predictions. This page describes how to deploy your model. You can find a more detailed explanation on the prediction overview page.

Before you begin

To deploy a model version, you need a TensorFlow SavedModel stored in Google Cloud Storage.

You can get a model by:

  • Following the Cloud ML Engine training steps to train in the cloud.

  • Training elsewhere and exporting to a SavedModel, as in the sketch below.
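
For example, here is a minimal sketch of exporting a TensorFlow 1.x graph as a SavedModel with tf.saved_model.simple_save; the graph, export directory, and bucket path are all illustrative, and the inputs and outputs dictionaries define the serving signature:

    import tensorflow as tf

    # A trivial graph; in practice this is your trained model.
    x = tf.placeholder(tf.float32, shape=[None, 3], name='x')
    w = tf.Variable(tf.zeros([3, 1]), name='w')
    y = tf.matmul(x, w, name='y')

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Writes saved_model.pb and a variables/ directory to export_dir.
        tf.saved_model.simple_save(
            sess,
            export_dir='saved_model/1',
            inputs={'x': x},
            outputs={'y': y})

You can then copy the export directory to Cloud Storage, for example with gsutil cp -r saved_model/1 gs://your_bucket/saved_model/1.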

Gathering deployment parameters

You can find detailed information about the parameters you'll need to deploy your model on the prediction concepts page.

Creating a model version

When you deploy your model, you do so as a model version. You'll need to have your training artifacts in a Google Cloud Storage location before you start.
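
For reference, a SavedModel directory in Cloud Storage typically looks like the following; the bucket and directory names are illustrative, and the path you provide at deployment time is the directory that contains saved_model.pb:

    gs://your_bucket/model_dir/
        saved_model.pb
        variables/
            variables.data-00000-of-00001
            variables.index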

Console

  1. Open the Cloud ML Engine models page in the Google Cloud Platform Console.

  2. If needed, create the model to add your new version to:

    1. Click Create Model.

    2. Enter a name for your model in the Model name box.

    3. Click Create.

    4. Verify that you have returned to the Models page, and that your new model appears in the list.

  3. Select your model from the list.

  4. Click Create a version under Versions on the Model details page.

  5. Fill in the form on the Create version page:

    1. Enter a name for your version in the Name box.

    2. Enter the Cloud Storage path to your training artifacts in the Source box.

    3. Click Create.

    4. Verify that you have returned to the Model details page, and that your new version appears in the Versions list.

gcloud

  1. If needed, create the model that you are deploying a new version of:

    gcloud ml-engine models create "model_name"
    
  2. Optionally, set an environment variable to store your Cloud Storage path so that you don't have to retype it in the next command:

    DEPLOYMENT_SOURCE="bucket_path"
    
  3. Create the version:

    gcloud ml-engine versions create "version_name" \
        --model "model_name" --origin $DEPLOYMENT_SOURCE
    
  4. Get information about your new version:

    gcloud ml-engine versions describe "your_version_name" \
        --model "your_model_name"</td>
    

    You should see output similar to this:

    createTime: '2016-09-29T16:30:45Z'
    deploymentUri: gs://your_bucket_path
    isDefault: true
    name: projects/project_name/models/model_name/versions/version_name
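
    The first version you create for a model becomes its default, which is why isDefault is true above. If you later deploy more versions and want a different one to serve by default, you can promote it:

    gcloud ml-engine versions set-default "version_name" \
        --model "model_name"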
    

Python

  1. Import the packages required to get an auth token and to use the Cloud ML Engine APIs from the Google API Client:

    from oauth2client.client import GoogleCredentials
    from googleapiclient import discovery
    from googleapiclient import errors
    # time is used to poll until the operation finishes.
    import time
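
    If these packages aren't installed, pip install google-api-python-client oauth2client typically provides both.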
    
  2. Set variables for your project and model, using the required format for the APIs (projects/project/models/model/versions/version). Also create a variable for the Cloud Storage location of your training artifacts:

    projectID = 'projects/{}'.format('project_name')
    modelName = 'model_name'
    modelID = '{}/models/{}'.format(projectID, modelName)
    versionName = 'version_name'
    versionDescription = 'version_description'
    trainedModelLocation = 'gs://bucket_path'
    
  3. Get your application default credentials and build the Python representation of the Cloud ML Engine API:

    credentials = GoogleCredentials.get_application_default()
    ml = discovery.build('ml', 'v1', credentials=credentials)
    
  4. If needed, create the model to which this version will belong:

    # Create a dictionary with the fields from the request body.
    requestDict = {'name': modelName,
        'description': 'Another model for testing.'}
    
    # Create a request to call projects.models.create.
    request = ml.projects().models().create(parent=projectID,
                                body=requestDict)
    
    # Make the call.
    try:
        response = request.execute()
    
        # Any additional code on success goes here (logging, etc.)
    
    except errors.HttpError as err:
        # Something went wrong, print out some information.
        print('There was an error creating the model.' +
            ' Check the details:')
        print(err._get_reason())
    
        # Clear the response for next time.
        response = None
    
  5. Create a dictionary with entries for the version creation job request:

    requestDict = {'name': versionName,
        'description': versionDescription,
        'deploymentUri': trainedModelLocation}
    
  6. Create the request and make the service call to create the version:

    # Create a request to call projects.models.versions.create
    request = ml.projects().models().versions().create(parent=modelID,
                  body=requestDict)
    
    # Make the call.
    try:
        response = request.execute()
    
        # Get the operation name.
        operationID = response['name']
    
        # Any additional code on success goes here (logging, etc.)
    
    except errors.HttpError as err:
        # Something went wrong, print out some information.
        print('There was an error creating the version.' +
              ' Check the details:')
        print(err._get_reason())
    
        # Handle the exception as makes sense for your application.
    
  7. Monitor the status of the create operation:

    done = False
    request = ml.projects().operations().get(name=operationID)

    while not done:
        response = None

        # Wait for 300 milliseconds.
        time.sleep(0.3)

        # Make the next call.
        try:
            response = request.execute()

            # Check for finish.
            done = response.get('done', False)

        except errors.HttpError as err:
            # Something went wrong, print out some information.
            print('There was an error getting the operation.' +
                  ' Check the details:')
            print(err._get_reason())
            done = True
    

    This example waits for version creation to finish, but you may not want to block processing in your application.
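
    One alternative is to poll less aggressively with exponential backoff rather than a fixed 300 ms sleep. A minimal sketch, reusing the request and time objects from the previous step; the delay values are illustrative:

    done = False
    delay, max_delay = 1, 60  # seconds; illustrative values

    while not done:
        response = request.execute()
        done = response.get('done', False)
        if not done:
            time.sleep(delay)
            delay = min(delay * 2, max_delay)  # double the wait, up to the cap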
