This legacy version of AI Platform Training is deprecated and will no longer be available on Google Cloud after January 31, 2025. Migrate your resources to Vertex AI custom training to get new machine learning features that are unavailable in AI Platform.

Reference for built-in image object detection algorithm

This page provides detailed reference information about arguments you submit to AI Platform Training when running a training job using the built-in image object detection algorithm.

Versioning

The built-in image object detection algorithm uses TensorFlow 1.14.

Data format arguments

Arguments	Details
training_data_path	Path pattern for the TFRecord files used for training. It can be a glob pattern, for example: gs://input_data_bucket/train-001-100.tfrec, gs://input_data_bucket/train*, gs://input_data_bucket/train-0??-100.tfrec, or gs://input_data_bucket/train-00[1-9]-100.tfrec. Required Type: String
validation_data_path	Path pattern for the TFRecord files used for validation. It can be a glob pattern, for example: gs://input_data_bucket/validation-001-100.tfrec, gs://input_data_bucket/validation*, gs://input_data_bucket/validation-0??-100.tfrec, or gs://input_data_bucket/validation-00[1-9]-100.tfrec. Required Type: String
job-dir	Path where the model, checkpoints, and other training artifacts are stored. The following directory is created here: model, which contains the best trained SavedModel checkpoints. Required Type: String
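
The glob patterns in the table can be checked locally before a job is submitted. The following is a minimal sketch that uses Python's standard `fnmatch` module to show which file names each example pattern selects. The object names in `candidate_files` and the helper `matching_files` are hypothetical, and `fnmatch` only approximates the matching that the training service performs against Cloud Storage.

```python
from fnmatch import fnmatchcase

# Hypothetical object names, used only to illustrate the patterns.
candidate_files = [
    "gs://input_data_bucket/train-001-100.tfrec",
    "gs://input_data_bucket/train-002-100.tfrec",
    "gs://input_data_bucket/train-010-100.tfrec",
    "gs://input_data_bucket/validation-001-100.tfrec",
]

# The example patterns from the training_data_path row.
patterns = [
    "gs://input_data_bucket/train-001-100.tfrec",
    "gs://input_data_bucket/train*",
    "gs://input_data_bucket/train-0??-100.tfrec",
    "gs://input_data_bucket/train-00[1-9]-100.tfrec",
]


def matching_files(pattern, filenames):
    """Return the file names that match a shell-style glob pattern."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


for pattern in patterns:
    print(pattern, "->", len(matching_files(pattern, candidate_files)), "file(s)")
```

In this sketch, `*` matches any run of characters, `?` matches exactly one character, and `[1-9]` matches one character in the given range, so the last pattern skips `train-010-100.tfrec`.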

Hyperparameters

Hyperparameter	Details
BASIC PARAMETERS
num_classes	The number of classes in the training/validation data. This must match the classes in training/validation dataset. For example, if `num_classes`=5, then the `image/object/class/label` field of each input `tf.Example` must be in the range [1, 5]. Required Type: Integer
max_steps	The number of steps that the training job will run. After `max_steps`, the training job will finish automatically. Required Type: Integer
train_batch_size	The number of images used in one training step. If this number is too big, the job may fail with an out-of-memory (OOM) error. Default: 32 Type: Integer
num_eval_images	The number of total images used for evaluation. If it is 0, all the images in validation_data_path will be used for evaluation. Default: 0 Type: Integer
pretrained_checkpoint_path	The path to pretrained checkpoints. A good pretrained checkpoint can speed up model convergence or improve model quality.
LEARNING RATE PARAMETERS
learning_rate_decay_type	The method by which the learning rate decays during training. Default: 'cosine' Type: String Options: one of {cosine, stepwise}
warmup_learning_rate	The learning rate used during the warm-up phase. Default: 0 Type: Float
warmup_steps	The length of the warm-up phase, in steps. The training job uses `warmup_learning_rate` during the warm-up phase. When the warm-up phase is over, the training job uses `initial_learning_rate`. Default: 0 Type: Integer
initial_learning_rate	The initial learning rate after the warm-up phase. Default: 0.0001 Type: Float
stepwise_learning_rate_steps	The steps at which the learning rate changes for the `stepwise` learning rate decay type. For example, `100,200` means the learning rate changes (to the values in `stepwise_learning_rate_levels`) at step 100 and step 200. This setting applies only when `learning_rate_decay_type` is set to `stepwise`. Default: 100,200 Type: String
stepwise_learning_rate_levels	The learning rate value for each step listed in `stepwise_learning_rate_steps`, used for the `stepwise` learning rate decay type. This setting applies only when `learning_rate_decay_type` is set to `stepwise`. Default: 0.008,0.0008 Type: String
OPTIMIZER PARAMETERS
optimizer_type	The optimizer used for training. Default: 'momentum' Type: String Options: one of {momentum, adam, adadelta, adagrad, rmsprop}
MODEL PARAMETERS
image_size	The image size (height, width) used for training. Note that the training job may run out of memory (OOM) if this value is too big. Default: "640,640" Type: String
resnet_depth	The depth of ResNet backbone. Type: Integer Options: one of {18,34,50,101,152,200}
fpn_type	The multi-level Feature Pyramid Network (FPN) type. Type: String Options: one of {fpn, nasfpn}
bbox_aspect_ratios	The aspect ratios of the base anchors. Default: "1.0,2.0,0.5" Type: String
max_num_bboxes_in_training	The maximum number of proposed bboxes used for training. Default: 100 Type: Integer
max_num_bboxes_in_prediction	The maximum number of proposed bboxes in prediction outputs. Default: 100 Type: Integer
nms_iou_threshold	The intersection over union (IoU) threshold used to decide whether bboxes overlap during non-maximum suppression. Default: 0.5 Type: Float
nms_score_threshold	The threshold for deciding when to remove boxes based on score. Default: 0.05 Type: Float
focal_loss_alpha	Focal loss alpha (balancing param) value. Default: 0.25 Type: Float
focal_loss_gamma	Focal loss gamma (focusing param) value. Default: 1.5 Type: Float
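
The learning rate parameters in the table interact with one another, so a small sketch may help when choosing values. The function below is hypothetical and is not the service's implementation. It assumes that the warm-up phase holds `warmup_learning_rate` constant (as the `warmup_steps` row describes), that `cosine` decay follows a standard half-cosine from `initial_learning_rate` down to zero at `max_steps`, and that each `stepwise` level takes effect at its matching step.

```python
import math


def learning_rate_at(step, *, max_steps, learning_rate_decay_type="cosine",
                     warmup_learning_rate=0.0, warmup_steps=0,
                     initial_learning_rate=0.0001,
                     stepwise_learning_rate_steps="100,200",
                     stepwise_learning_rate_levels="0.008,0.0008"):
    """Hypothetical sketch of the learning rate at a given training step."""
    # Warm-up phase: the table says warmup_learning_rate is used here.
    if step < warmup_steps:
        return warmup_learning_rate

    if learning_rate_decay_type == "stepwise":
        # Both parameters are comma-separated strings, as in the table.
        boundaries = [int(s) for s in stepwise_learning_rate_steps.split(",")]
        levels = [float(v) for v in stepwise_learning_rate_levels.split(",")]
        rate = initial_learning_rate
        for boundary, level in zip(boundaries, levels):
            if step >= boundary:
                rate = level
        return rate

    if learning_rate_decay_type == "cosine":
        # Assumption: decay to zero over the steps that remain after warm-up.
        decay_steps = max(max_steps - warmup_steps, 1)
        progress = min((step - warmup_steps) / decay_steps, 1.0)
        return 0.5 * initial_learning_rate * (1.0 + math.cos(math.pi * progress))

    raise ValueError("unknown decay type: " + learning_rate_decay_type)
```

For example, `learning_rate_at(150, max_steps=300, learning_rate_decay_type="stepwise")` falls between the two default boundaries and therefore returns the first default level, 0.008.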

Hyperparameter tuning

Hyperparameter tuning tests different hyperparameter configurations when training your model. It finds hyperparameter values that are optimal for the selected goal metric. For each tunable argument, you can specify a range of values to restrict and focus the possibilities AI Platform Training can try.

Learn more about hyperparameter tuning on AI Platform Training.

Goal metrics

For the image object detection algorithm, the only option is to maximize the average precision (AP):

Objective Metric	Direction	Details
AP	MAXIMIZE	The average precision (AP) of detection.

Tunable hyperparameters

When training with the built-in image object detection algorithm, you can tune the following hyperparameters. Start by tuning parameters with "high tunable value." These have the greatest impact on your goal metric.

Hyperparameter	Type	Range/Values
PARAMETERS WITH HIGH TUNABLE VALUE
initial_learning_rate	DOUBLE	[0, inf]
learning_rate_decay_type	CategoricalValues(STRING)	one of {cosine, stepwise}
max_steps	INTEGER	[0, inf]
optimizer_type	CategoricalValues(STRING)	one of {momentum, adam, adadelta, adagrad, rmsprop}
OTHER PARAMETERS
fpn_type	String	Try different FPN types, for example `nasfpn`.
bbox_aspect_ratios	CategoricalValues(STRING)	See the Hyperparameters section for more details.
max_num_bboxes_in_training	Integer	See the Hyperparameters section for more details.
max_num_bboxes_in_prediction	Integer	See the Hyperparameters section for more details.
nms_iou_threshold	Float	See the Hyperparameters section for more details.
nms_score_threshold	Float	See the Hyperparameters section for more details.
focal_loss_alpha	Float	See the Hyperparameters section for more details.
focal_loss_gamma	Float	See the Hyperparameters section for more details.
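
As an illustration of how the high-value parameters above might be expressed in a tuning request, the sketch below builds a hyperparameter spec as a plain Python dictionary. The field names (`goal`, `hyperparameterMetricTag`, `maxTrials`, `maxParallelTrials`, `params`, `scaleType`) follow the AI Platform Training `HyperparameterSpec` as best understood here and should be checked against the API reference. The function name, trial counts, and value ranges are assumptions chosen for illustration, and the metric tag `AP` is inferred from the goal metrics table.

```python
import json


def build_hyperparameter_spec(max_trials=10, max_parallel_trials=2):
    """Hypothetical tuning spec covering the four high-value parameters."""
    return {
        "goal": "MAXIMIZE",                # the only supported direction for AP
        "hyperparameterMetricTag": "AP",   # assumed tag, from the goal metrics table
        "maxTrials": max_trials,
        "maxParallelTrials": max_parallel_trials,
        "params": [
            {
                "parameterName": "initial_learning_rate",
                "type": "DOUBLE",
                "minValue": 0.00001,       # illustrative range only
                "maxValue": 0.01,
                "scaleType": "UNIT_LOG_SCALE",
            },
            {
                "parameterName": "learning_rate_decay_type",
                "type": "CATEGORICAL",
                "categoricalValues": ["cosine", "stepwise"],
            },
            {
                "parameterName": "max_steps",
                "type": "INTEGER",
                "minValue": 1000,          # illustrative range only
                "maxValue": 20000,
                "scaleType": "UNIT_LINEAR_SCALE",
            },
            {
                "parameterName": "optimizer_type",
                "type": "CATEGORICAL",
                "categoricalValues": [
                    "momentum", "adam", "adadelta", "adagrad", "rmsprop",
                ],
            },
        ],
    }


spec = build_hyperparameter_spec()
print(json.dumps(spec, indent=2))
```

Narrowing the ranges, rather than using the full `[0, inf]` interval listed in the table, restricts the search to values that are plausible for the dataset.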

Training using the built-in image object detection algorithm