Cloud Machine Learning Engine offers scalable, flexible pricing options to help fit your project and budget. Cloud ML Engine charges for training models and getting predictions, but managing your machine learning resources in the cloud is free of charge.

Pricing overview

The following table summarizes Cloud ML Engine pricing. All use is also subject to the Cloud ML Engine quota policy.

Action | US cost | Europe/Asia cost
Model training | $0.49 per hour, per ML training unit | $0.54 per hour, per ML training unit
Prediction | $0.10 per thousand predictions, plus $0.40 per node hour | $0.11 per thousand predictions, plus $0.44 per node hour

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

Required use of Google Cloud Storage

In addition to the costs described in this document, you are required to store data and program files in Google Cloud Storage buckets during the Cloud ML Engine lifecycle. This storage is subject to the Cloud Storage pricing policy.

Required Cloud Storage use includes:

  • Staging your training application package when training your model.

  • Storing your training input data.

  • Staging your model files when you are ready to deploy a version.

  • Storing your input data for batch prediction.

  • Storing your batch prediction output files.

  • Storing the output of your training jobs.

As the list indicates, training and batch prediction outputs must be stored in Cloud Storage buckets. Cloud ML Engine does not require long-term storage of these items; you can remove the files as soon as the operation is complete. Refer to the Cloud Storage pricing policy for details of the costs associated with storing data.

Free operations

The resource management operations provided by Cloud ML Engine are available free of charge. The Cloud ML Engine quota policy does limit some of these operations.

Resource | Free operations
models | create, get, list, delete
versions | create, get, list, delete, setDefault
jobs | get, list, cancel
operations | get, list, cancel, delete

Training pricing

Pricing for training your models in the cloud is defined in terms of ML training units, an abstract measurement of the processing power involved. One ML training unit represents a standard machine configuration used by the training service. The number of ML training units you are charged for per hour of processing is determined by the processing configuration you choose when you start your training job.

You are charged for training:

  • In one-minute increments.
  • Based on an hourly rate, shown in the following table.
  • A minimum of 10 minutes per training job.
  • From the moment when resources are provisioned for a job until the job completes.

Region | Training cost
US | $0.49 per hour, per ML training unit
Europe and Asia | $0.54 per hour, per ML training unit

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

ML training units by scale tier

You can control the type of processing cluster to use when training your model. The simplest way is to choose from one of the predefined configurations called scale tiers. Each scale tier has a cost in ML training units:

Scale tier | ML training units
CUSTOM | Sum of machine types

Machine types for custom cluster configurations

If you select CUSTOM as your scale tier, you have control over the number and type of virtual machines to use for the cluster's master, workers and parameter servers. Each machine in a custom configuration has its own cost in ML training units per hour:

Machine type | ML training units per instance
standard | 1
large_model | 3
complex_model_s | 2
complex_model_m | 3
complex_model_l | 6
standard_gpu | 3
complex_model_m_gpu | 12
complex_model_l_gpu | 24

The cost of training with a custom processing cluster is the sum of all the machines you specify. You are charged for the total time of the job, not for the active processing time of individual machines.
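
As a minimal sketch of that sum, the following Python snippet totals the ML training units for a custom cluster. The unit values are copied from the machine type table above; the dictionary layout and the function name custom_cluster_units are illustrative assumptions, not part of any Cloud ML Engine API.

```python
# ML training units per instance, copied from the machine type table above.
ML_UNITS_PER_INSTANCE = {
    "standard": 1,
    "large_model": 3,
    "complex_model_s": 2,
    "complex_model_m": 3,
    "complex_model_l": 6,
    "standard_gpu": 3,
    "complex_model_m_gpu": 12,
    "complex_model_l_gpu": 24,
}


def custom_cluster_units(machine_counts):
    """Return the total ML training units for a custom cluster.

    machine_counts maps a machine type name to the number of instances
    of that type across the master, workers, and parameter servers.
    (Hypothetical helper for illustration only.)
    """
    return sum(
        ML_UNITS_PER_INSTANCE[machine_type] * count
        for machine_type, count in machine_counts.items()
    )


# One complex_model_s master plus eight complex_model_s workers,
# and five large_model parameter servers.
cluster = {"complex_model_s": 1 + 8, "large_model": 5}
print(custom_cluster_units(cluster))  # 33
```

Because you are charged for the total time of the job, this total applies to every billed minute, regardless of how busy each machine is.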

Training cost examples

To find your training cost, apply the following formula:

(ML training units * cost per unit / 60) * job duration in minutes

The following examples illustrate the calculation:

  • A data scientist in the US region runs a training job and selects the STANDARD_1 scale tier. Her job takes 15 minutes:

    (10 ML training units * $0.49 per hour / 60) * 15 minutes

    For a total of $1.23 for the job.

  • A computer science professor in the US region runs a training job using the CUSTOM scale tier. She has a very large model, so she wants to take advantage of the large model VMs for her parameter servers. She configures her processing cluster like this:

    • A complex_model_s machine for her master (2 ML training units).
    • 5 parameter servers on large_model VMs (5 @ 3 = 15 ML training units).
    • 8 workers on complex_model_s VMs (8 @ 2 = 16 ML training units).

    Her job runs for 2 hours and 26 minutes:

    (33 ML training units * $0.49 per hour / 60) * 146 minutes

    For a total of $39.35 for the job.
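
The same arithmetic can be sketched in Python. This is an estimate under stated assumptions, not an official billing calculator: the function name training_cost is hypothetical, partial minutes are assumed to round up to the next whole minute, and the total is assumed to round half-up to the nearest cent. The ten-minute minimum comes from the billing rules listed earlier.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

MINIMUM_BILLED_MINUTES = 10  # minimum charge per training job


def training_cost(ml_units, hourly_rate_per_unit, duration_minutes):
    """Estimate the cost of one training job in USD.

    Assumptions (not confirmed by the pricing text): partial minutes are
    rounded up, and the total is rounded half-up to the nearest cent.
    Pass the hourly rate as a string such as "0.49" to keep it exact.
    """
    billed_minutes = max(MINIMUM_BILLED_MINUTES, math.ceil(duration_minutes))
    # Multiply before dividing so that exact decimal values stay exact.
    cost = Decimal(ml_units) * Decimal(hourly_rate_per_unit) * billed_minutes / 60
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(training_cost(10, "0.49", 15))   # 1.23
print(training_cost(33, "0.49", 146))  # 39.35
```

Decimal arithmetic is used here instead of floats so that a value such as 1.225 rounds predictably to 1.23.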

Prediction pricing

Prediction pricing applies to requests made to trained model versions hosted by Cloud ML Engine. You are charged a flat rate per prediction processed each month. You are also charged for the time used on each node in the processing cluster that performs the predictions.

You are charged for predictions processed according to the following table:

Region | Standard rate | High volume rate
US | $0.10 per 1000 predictions, plus $0.40 per node hour | $0.05 per 1000 predictions, plus $0.40 per node hour
Europe and Asia | $0.11 per 1000 predictions, plus $0.44 per node hour | $0.05 per 1000 predictions, plus $0.44 per node hour

Notes:

  • Predictions are accumulated per month and the cost is rounded to the nearest penny.
  • Processing is charged per minute separately for each job, with a minimum charge of ten minutes.
  • High volume pricing applies to requests after the 100 millionth in a calendar month.

If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.
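
To illustrate how the standard and high volume rates combine, here is a minimal Python sketch of the per-prediction part of a monthly bill (node hours are excluded). It assumes that the first 100 million requests in a month are billed at the standard rate and only the requests beyond that at the high volume rate, and that "nearest penny" means half-up rounding. The function name per_prediction_charge is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

HIGH_VOLUME_THRESHOLD = 100_000_000  # requests per calendar month


def per_prediction_charge(predictions, standard_rate, high_volume_rate):
    """Monthly per-prediction charge in USD, excluding node hours.

    Rates are prices per 1000 predictions, passed as strings such as "0.10".
    Requests up to the threshold use the standard rate; the remainder use
    the high volume rate (an assumed reading of the pricing notes).
    """
    standard_count = min(predictions, HIGH_VOLUME_THRESHOLD)
    high_volume_count = max(0, predictions - HIGH_VOLUME_THRESHOLD)
    cost = (
        standard_count * Decimal(standard_rate)
        + high_volume_count * Decimal(high_volume_rate)
    ) / 1000
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(per_prediction_charge(16007, "0.10", "0.05"))        # 1.60
print(per_prediction_charge(150_000_000, "0.10", "0.05"))  # 12500.00
```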

Understanding node hours

The online processing resources that Cloud ML Engine uses to run your model for prediction are called nodes. You can think of a node as a virtual machine. You are charged for the time that your model is running on a node, including:

  • When processing a batch prediction job.
  • When processing an online prediction request.
  • When your model is in a ready state for serving online predictions.

Cloud ML Engine scales the number of nodes it uses to accommodate the workload for both online and batch prediction. This includes the online prediction service keeping your model in a ready state for a few idle minutes after servicing a request. The priority of this scaling is:

  • For online prediction, reducing the latency of individual requests.
  • For batch prediction, reducing the total elapsed time of the job.

Scaling during batch prediction should have little effect on the price of your job, though there is some overhead involved in bringing up a new node.

Scaling during online prediction will affect your total charges each month: the more numerous and frequent your requests, the more nodes will be used.

You can affect scaling by:

  • Setting a maximum number of nodes to use for a batch prediction job.
  • Setting the number of nodes to keep running for a model when you deploy it.

You can learn more about node allocation and scaling in the prediction guide.

Prediction cost examples

To find your prediction cost for a month, apply the following formula:

(Number of predictions / 1000) * cost per 1K + (hours of processing * cost per hour)

The following example illustrates the calculation:

  • A real-estate company in the US runs a weekly prediction of housing values in areas they serve. In one month they run predictions for four weeks in batches of 3920, 4277, 3849, and 3961. Predictions take an average of 0.72 node seconds of processing to infer.

    The cost per prediction is cumulative:

    3920 + 4277 + 3849 + 3961 = 16007

    They are charged for 16007: (16007 / 1000) * $0.10 = $1.60

    The cost for processing is per job (this example uses the average, but real costs use exact values for each job). Converting node seconds to minutes:

    3920 * 0.72 seconds / 60 = 47.04 minutes
    4277 * 0.72 seconds / 60 = 51.324 minutes
    3849 * 0.72 seconds / 60 = 46.188 minutes
    3961 * 0.72 seconds / 60 = 47.532 minutes

    Each job runs longer than the ten-minute minimum, so each is charged per minute of processing, with partial minutes rounded up to the next whole minute:

    (48 * $0.40) / 60 = $0.32
    (52 * $0.40) / 60 = $0.35
    (47 * $0.40) / 60 = $0.31
    (48 * $0.40) / 60 = $0.32

    For a total processing charge of $1.30 and a total charge for the month of $2.90.
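
The example above can be reproduced with a short Python sketch. It is an approximation under stated assumptions: partial minutes round up, each charge rounds half-up to the nearest cent, and the month stays below the high volume threshold. The function names are hypothetical and do not correspond to any Cloud ML Engine API.

```python
from decimal import Decimal, ROUND_CEILING, ROUND_HALF_UP

CENT = Decimal("0.01")
MINIMUM_BILLED_MINUTES = 10  # minimum processing charge per job


def job_processing_charge(node_seconds, node_hour_rate):
    """Charge for one job's node time, billed in whole minutes (assumed rounded up)."""
    minutes = (Decimal(node_seconds) / 60).to_integral_value(rounding=ROUND_CEILING)
    billed = max(minutes, Decimal(MINIMUM_BILLED_MINUTES))
    return (billed * Decimal(node_hour_rate) / 60).quantize(CENT, rounding=ROUND_HALF_UP)


def monthly_prediction_cost(batch_sizes, seconds_per_prediction,
                            rate_per_1000, node_hour_rate):
    """Per-prediction charge for the month plus per-job processing charges."""
    total_predictions = sum(batch_sizes)
    per_prediction = (total_predictions * Decimal(rate_per_1000) / 1000).quantize(
        CENT, rounding=ROUND_HALF_UP
    )
    processing = sum(
        job_processing_charge(size * Decimal(seconds_per_prediction), node_hour_rate)
        for size in batch_sizes
    )
    return per_prediction + processing


batches = [3920, 4277, 3849, 3961]
print(monthly_prediction_cost(batches, "0.72", "0.10", "0.40"))  # 2.90
```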
