Developing a custom training application
Training code requirements
Describes requirements to consider as you write training code.
Understanding the custom training service on AI Platform
Describes the lifecycle of a training cluster during a distributed training job, and explains how the AI Platform custom training service handles errors.
Creating a Python training application for a pre-built container
How to create a Python source distribution that contains your training application and upload it to a Cloud Storage bucket.
Pre-built containers for custom training
Provides a list of the pre-built containers for training, and describes how to use them with a Python training application.
Custom containers overview
An overview of custom containers: Docker images that you create to run your training application.
Creating a custom container image for training
How to build a custom container image to perform custom training.
Exporting model artifacts for prediction
Requirements for saving a model in the form of one or more model artifacts so AI Platform (Unified) can serve predictions.
Overview of hyperparameter tuning
Describes the concepts involved in hyperparameter tuning, a feature of AI Platform (Unified) that automatically searches for hyperparameter values that improve your model.
Using distributed training
How to run distributed training jobs on AI Platform.