Pricing

Cloud Machine Learning Engine offers scalable, flexible pricing options to fit your project and budget. Cloud ML Engine charges you for training your models and for getting predictions, but managing your machine learning resources in the cloud is free of charge.

Pricing overview

The following tables summarize the pricing per region for training, batch prediction, and online prediction.

US

Training (US)

The cost of a training job is $0.49 per hour, per training unit (see note 2). The number of training units is determined by the machine configuration you choose to run your job. You can choose a predefined scale tier or a custom configuration of selected machine types. See the following tables for details.

Predefined scale tiers - price per hour (and training units)

BASIC              $0.2774 (0.5661)
STANDARD_1         $2.9025 (5.9234)
PREMIUM_1          $24.1683 (49.323)
BASIC_GPU          $1.2118 (2.4731)
BASIC_TPU (Beta)   $6.8474 (13.9743)
CUSTOM             See machine types below.

If you select CUSTOM as your scale tier, you have control over the number and type of virtual machines used for your training job. See the following table of machine types.

Machine types - price per hour (and training units)

standard                      $0.2774 (0.5661)
large_model                   $0.6915 (1.4111)
complex_model_s               $0.4141 (0.845)
complex_model_m               $0.8281 (1.69)
complex_model_l               $1.6562 (3.38)
standard_gpu                  $1.2118 (2.4731)
complex_model_m_gpu           $3.7376 (7.6278)
complex_model_l_gpu           $7.4752 (15.2555)
standard_p100                 $2.6864 (5.4824)
complex_model_m_p100          $9.636 (19.6653)
standard_v100 (Beta)          $4.1756 (8.5216)
large_model_v100 (Beta)       $4.3123 (8.8005)
complex_model_m_v100 (Beta)   $15.5928 (31.8220)
complex_model_l_v100 (Beta)   $31.1856 (63.6441)
cloud_tpu (Beta)              $6.5700 (13.4082)

Batch prediction (US): $0.09262 per node hour (see note 3).
Online prediction (US): $0.056 per node hour (see note 3).

Europe

Training (Europe)

The cost of a training job is $0.54 per hour, per training unit (see note 2). The number of training units is determined by the machine configuration you choose to run your job. You can choose a predefined scale tier or a custom configuration of selected machine types. See the following tables for details.

Predefined scale tiers - price per hour (and training units)

BASIC              $0.3212 (0.5948)
STANDARD_1         $3.3609 (6.2239)
PREMIUM_1          $27.9794 (51.8138)
BASIC_GPU          $1.3578 (2.5144)
BASIC_TPU (Beta)   (Not available)
CUSTOM             See machine types below.

If you select CUSTOM as your scale tier, you have control over the number and type of virtual machines used for your training job. See the following table of machine types.

Machine types - price per hour (and training units)

standard                      $0.3212 (0.5948)
large_model                   $0.8001 (1.4816)
complex_model_s               $0.4795 (0.8879)
complex_model_m               $0.9589 (1.7758)
complex_model_l               $1.9179 (3.5516)
standard_gpu                  $1.3578 (2.5144)
complex_model_m_gpu           $4.1464 (7.6785)
complex_model_l_gpu           $8.2928 (15.357)
standard_p100                 $2.9784 (5.5156)
complex_model_m_p100          $10.6288 (19.6830)
standard_v100 (Beta)          $4.3339 (8.0257)
large_model_v100 (Beta)       $4.4834 (8.3025)
complex_model_m_v100 (Beta)   $16.1137 (29.8402)
complex_model_l_v100 (Beta)   $32.2275 (59.6805)
cloud_tpu (Beta)              (Not available)

Batch prediction (Europe): $0.10744 per node hour (see note 3).
Online prediction (Europe): $0.061 per node hour (see note 3).

Asia Pacific

Training (Asia Pacific)

The cost of a training job is $0.54 per hour, per training unit (see note 2). The number of training units is determined by the machine configuration you choose to run your job. You can choose a predefined scale tier or a custom configuration of selected machine types. See the following tables for details.

Predefined scale tiers - price per hour (and training units)

BASIC              $0.3212 (0.5948)
STANDARD_1         $3.3609 (6.2239)
PREMIUM_1          $27.9794 (51.8138)
BASIC_GPU          $1.3578 (2.5144)
BASIC_TPU (Beta)   (Not available)
CUSTOM             See machine types below.

If you select CUSTOM as your scale tier, you have control over the number and type of virtual machines used for your training job. See the following table of machine types.

Machine types - price per hour (and training units)

standard                      $0.3212 (0.5948)
large_model                   $0.8001 (1.4816)
complex_model_s               $0.4795 (0.8879)
complex_model_m               $0.9589 (1.7758)
complex_model_l               $1.9179 (3.5516)
standard_gpu                  $1.3578 (2.5144)
complex_model_m_gpu           $4.1464 (7.6785)
complex_model_l_gpu           $8.2928 (15.357)
standard_p100                 $2.9784 (5.5156)
complex_model_m_p100          $10.6288 (19.6830)
standard_v100 (Beta)          $4.3339 (8.0257)
large_model_v100 (Beta)       $4.4834 (8.3025)
complex_model_m_v100 (Beta)   $16.1137 (29.8402)
complex_model_l_v100 (Beta)   $32.2275 (59.6805)
cloud_tpu (Beta)              (Not available)

Batch prediction (Asia Pacific): $0.10744 per node hour (see note 3).
Online prediction (Asia Pacific): $0.071 per node hour (see note 3).

Notes:

  1. All use is subject to the Cloud ML Engine quota policy.
  2. Note the difference between the training units used on this page and the Consumed ML units shown on your Job Details page: job duration is already factored into Consumed ML units. See the details below.
  3. A node hour represents the time your prediction job spends running on a virtual machine. Read more about node hours.
  4. You are required to store your data and program files in Google Cloud Storage buckets during the Cloud ML Engine lifecycle. See more about Cloud Storage usage.
  5. For volume-based discounts, contact the Sales team.
  6. If you pay in a currency other than USD, the prices listed in your currency on Cloud Platform SKUs apply.

The pricing calculator

Use the pricing calculator to estimate your training and prediction costs.

More about training costs

You are charged for training your models in the cloud:

  • In one-minute increments.
  • At the price per hour shown in the table above; this price is calculated from a base price and the number of training units, which is determined by the processing configuration you choose when you start your training job.
  • With a minimum of 10 minutes per training job.
  • From the moment when resources are provisioned for a job until the job finishes.
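The billing rules above can be sketched as a small helper. This is a minimal sketch with hypothetical function names; the assumption that the 10-minute minimum is applied after rounding up to whole minutes is mine, inferred from the worked examples later on this page, not stated explicitly here.

```python
import math

def billable_minutes(duration_minutes):
    """Billable time for one training job: rounded up to whole minutes
    (one-minute increments), with a 10-minute minimum per job.
    The ordering of rounding vs. minimum is an assumption."""
    return max(10, math.ceil(duration_minutes))

def training_job_cost(price_per_hour, duration_minutes):
    """Charge the hourly rate for the billable minutes."""
    return price_per_hour / 60 * billable_minutes(duration_minutes)
```

For example, a 3-minute BASIC job in the US ($0.2774 per hour) would be billed for the 10-minute minimum under these assumptions.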

Scale tiers for predefined configurations

You can control the type of processing cluster to use when training your model. The simplest way is to choose from one of the predefined configurations called scale tiers. Read more about scale tiers.

Machine types for custom configurations

If you select CUSTOM as your scale tier, you have control over the number and type of virtual machines to use for the cluster's master, worker and parameter servers. Read more about machine types.

The cost of training with a custom processing cluster is the sum of the costs of all the machines you specify. You are charged for the total time of the job, not for the active processing time of individual machines.

Examples: Calculate training cost using training units

Use training units to calculate the cost of your training job, with the following formula:

(training units * base price / 60) * job duration in minutes

Examples:

  • A data scientist in the US region runs a training job and selects the STANDARD_1 scale tier, which uses 5.9234 training units. Their job takes 15 minutes:

    (5.9234 training units * $0.49 per hour / 60) * 15 minutes
    

    For a total of $0.73 for the job.

  • A computer science professor in the US region runs a training job using the CUSTOM scale tier. They have a very large model, so they want to take advantage of the large model VMs for their parameter servers. They configure their processing cluster like this:

    • A complex_model_s machine for their master (0.845 training units).
    • 5 parameter servers on large_model VMs (5 @ 1.4111 = 7.0555 training units).
    • 8 workers on complex_model_s VMs (8 @ 0.845 = 6.76 training units).

    Their job runs for 2 hours and 26 minutes:

    (14.6605 training units * $0.49 per hour / 60) * 146 minutes
    

    For a total of $17.48 for the job.
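The formula above can be checked with a short script. The helper name is hypothetical; it reproduces the two worked examples, ignoring the per-minute rounding and the 10-minute minimum for simplicity.

```python
def training_cost(training_units, duration_minutes, base_price_per_hour=0.49):
    """(training units * base price / 60) * job duration in minutes.
    The default base price is the US rate; other regions differ."""
    return training_units * base_price_per_hour / 60 * duration_minutes

# STANDARD_1 in the US: 5.9234 training units for 15 minutes.
standard_1 = training_cost(5.9234, 15)

# CUSTOM cluster: master + 5 parameter servers + 8 workers.
custom_units = 0.845 + 5 * 1.4111 + 8 * 0.845   # 14.6605 training units
custom = training_cost(custom_units, 146)        # 2 h 26 min = 146 minutes
```

This reproduces the totals above: about $0.73 and $17.48 respectively.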

Examples: Calculate training cost using price per hour

Instead of training units, you can use the price per hour shown in the above table. The formula is as follows:

(Price per hour / 60) * job duration in minutes

Examples:

  • A data scientist in the US region runs a training job and selects the STANDARD_1 scale tier. Their job takes 15 minutes:

    ($2.9025 per hour / 60) * 15 minutes
    

    For a total of $0.73 for the job.

  • A computer science professor in the US region runs a training job using the CUSTOM scale tier. They have a very large model, so they want to take advantage of the large model VMs for their parameter servers. They configure their processing cluster like this:

    • A complex_model_s machine for their master ($0.4141).
    • 5 parameter servers on large_model VMs (5 @ $0.6915 = $3.4575).
    • 8 workers on complex_model_s VMs (8 @ $0.4141 = $3.3128).

    Their job runs for 2 hours and 26 minutes:

    (($0.4141 + $3.4575 + $3.3128) per hour / 60) * 146 minutes
    

    For a total of $17.48 for the job.

Examples: Calculate training cost using "Consumed ML units"

The Consumed ML units (Consumed Machine Learning units) shown on your Job details page are equivalent to training units with the duration of the job factored in. When using Consumed ML units in your calculations, use the following formula:

Consumed ML units * $0.49

Example:

  • A data scientist in the US region runs a training job. The field Consumed ML units on their Job details page shows 55.75. The calculation is as follows:

    55.75 consumed ML units * $0.49
    

    For a total of $27.32 for the job.
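In code form, the calculation is a single multiplication, since Consumed ML units already include duration (the $0.49 default is the US base price; other regions use their own base price):

```python
def cost_from_consumed_ml_units(consumed_ml_units, base_price=0.49):
    """Job duration is already factored in, so no time term is needed."""
    return consumed_ml_units * base_price
```

For the example above, `cost_from_consumed_ml_units(55.75)` yields $27.3175, or $27.32 rounded.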

To find your Job details page, go to the Jobs list and click the link for a specific job.

More about prediction costs

Prediction pricing applies to requests made to trained model versions hosted by Cloud ML Engine.

You are charged:

  • For the time used on each node in the processing cluster that performs the predictions.
  • In one-minute increments.
  • Based on a price per node hour as shown in the above table.
  • With a minimum of 10 minutes per prediction job.

Node hours

The online processing resources that Cloud ML Engine uses to run your model for prediction are called nodes. You can think of a node as a virtual machine. Cloud ML Engine scales the number of nodes it uses to accommodate the work for both online and batch prediction.

You are charged for the time that your model is running on a node, including:

  • When processing a batch prediction job.
  • When processing an online prediction request.
  • When your model is in a ready state for serving online predictions.

For batch prediction:

  • The priority of the scaling is to reduce the total elapsed time of the job.
  • Scaling should have little effect on the price of your job, though there is some overhead involved in bringing up a new node.
  • You can affect scaling by setting a maximum number of nodes to use for a batch prediction job, and by setting the number of nodes to keep running for a model when you deploy it.

For online prediction:

  • The priority of the scaling is to reduce the latency of individual requests.
  • The service keeps your model in a ready state for a few idle minutes after servicing a request.
  • Scaling affects your total charges each month: the more numerous and frequent your requests, the more nodes will be used.
  • You can choose to let the service scale in response to traffic (automatic scaling) or you can specify a number of nodes to run constantly to avoid latency.
  • If you choose automatic scaling, the number of nodes scales automatically and can scale down to zero during periods with no traffic.
  • If you choose to specify a number of nodes rather than automatic scaling, you are charged for all of the time that the nodes are running, starting at the time of deployment and persisting until you delete the model version.

Note that online prediction uses single-core machines with no GPUs or other accelerators.

You can learn more about node allocation and scaling.

Examples of prediction calculations

Use the formula below to calculate your prediction cost for a month:

(Price per hour / 60) * job duration in node minutes

Example:

  • A real-estate company in the US runs a weekly prediction of housing values in areas they serve. In one month they run predictions for four weeks in batches of 3920, 4277, 3849, and 3961. Each prediction takes an average of 0.72 node seconds of processing.

    The processing time is calculated per job (this example uses the average, but actual billing uses the exact values for each job):

    3920 predictions * 0.72 seconds / 60 = 47.04 minutes
    4277 predictions * 0.72 seconds / 60 = 51.324 minutes
    3849 predictions * 0.72 seconds / 60 = 46.188 minutes
    3961 predictions * 0.72 seconds / 60 = 47.532 minutes
    

    Each job runs for more than ten minutes, so it is charged for each minute of processing, rounded up to the next whole minute:

    ($0.09262 / 60) * 48 = $0.0741
    ($0.09262 / 60) * 52 = $0.08027
    ($0.09262 / 60) * 47 = $0.07255
    ($0.09262 / 60) * 48 = $0.0741
    

    For a total charge for the month of $0.30.
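The monthly calculation above can be sketched as follows. The helper names are hypothetical; each job's node minutes are rounded up to the next whole minute before pricing, which matches the 48, 52, 47, and 48 figures above.

```python
import math

PRICE_PER_NODE_HOUR_US = 0.09262  # batch prediction, US region

def batch_job_cost(num_predictions, node_seconds_each, price_per_node_hour):
    """Convert per-prediction node seconds to minutes, round up to a whole
    minute, then charge the per-node-hour rate for those minutes."""
    minutes = math.ceil(num_predictions * node_seconds_each / 60)
    return price_per_node_hour / 60 * minutes

jobs = [3920, 4277, 3849, 3961]  # predictions per weekly batch
monthly = sum(batch_job_cost(n, 0.72, PRICE_PER_NODE_HOUR_US) for n in jobs)
```

This yields about $0.30 for the month, matching the total above.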

The number of minutes shown in the examples does not correspond to actual elapsed time. Both batch and online prediction use one or more machines to process the data. Therefore, the actual elapsed time is typically shorter than the time expressed in node hours or minutes.

Required use of Google Cloud Storage

In addition to the costs described in this document, you are required to store data and program files in Google Cloud Storage buckets during the Cloud ML Engine lifecycle. This storage is subject to the Cloud Storage pricing policy.

Required use of Cloud Storage includes:

  • Staging your training application package.

  • Storing your training input data.

  • Staging your model files when you are ready to deploy a model version.

  • Storing your input data for batch prediction.

  • Storing the output of your batch prediction jobs. Cloud ML Engine does not require long-term storage of these items. You can remove the files as soon as the operation is complete.

  • Storing the output of your training jobs. Cloud ML Engine does not require long-term storage of these items. You can remove the files as soon as the operation is complete.

Free operations for managing your resources

The resource management operations provided by Cloud ML Engine are available free of charge. The Cloud ML Engine quota policy does limit some of these operations.

Resource     Free operations
models       create, get, list, delete
versions     create, get, list, delete, setDefault
jobs         get, list, cancel
operations   get, list, cancel, delete
