Reference for built-in BERT algorithm

This page provides detailed reference information about arguments you submit to AI Platform Training when running a training job using the built-in BERT algorithm.

Versioning

The built-in BERT algorithm uses TensorFlow 2.3.

Data format arguments

The following arguments are used for data formatting:

Arguments Details
train_dataset_path Cloud Storage path to a TFRecord file.

Required
Type: String
eval_dataset_path Cloud Storage path to a TFRecord file. Must have the same format as train_dataset_path.

Required
Type: String
job-dir Cloud Storage path where the model, checkpoints, and other training artifacts are stored. The following directory is created here:
  • model: Contains the trained model and model training checkpoints.

Required
Type: String
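These arguments are passed to the training job as command-line flags. A minimal sketch of assembling them, assuming a hypothetical bucket and a hypothetical build_data_args helper (neither is part of the algorithm itself):

```python
# Sketch: assembling the data-format flags for a training job submission.
# BUCKET and the file paths below are hypothetical placeholders.
BUCKET = "gs://my-bucket"

def build_data_args(train_path, eval_path, job_dir):
    """Render the algorithm's data-format arguments as --key=value flags."""
    return [
        f"--train_dataset_path={train_path}",
        f"--eval_dataset_path={eval_path}",
        f"--job-dir={job_dir}",
    ]

args = build_data_args(
    train_path=f"{BUCKET}/data/train.tfrecord",
    eval_path=f"{BUCKET}/data/eval.tfrecord",
    job_dir=f"{BUCKET}/bert_job",
)
print(args)
```

Note that both dataset paths must point to TFRecord files with the same format, and all artifacts written under job-dir land in Cloud Storage.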

Hyperparameters

Hyperparameter Details
BASIC PARAMETERS
input_meta_data_path Cloud Storage path to an input metadata schema file.

Required
Type: String
bert_config_file Cloud Storage path where the BERT config file is stored.

Required
Type: String
initial_checkpoint Starting checkpoint for fine-tuning (usually a pre-trained BERT model).

Required
Type: String
mode Mode for the algorithm run.

Required
Type: Enum
Options: train_and_eval, export_only
num_train_epochs Number of training epochs to run (only available in train_and_eval mode).

Type: Int
Default: 3
ADVANCED PARAMETERS
train_batch_size Batch size for training.

Type: Int
Default: 32
eval_batch_size Batch size for evaluation.

Type: Int
Default: 32
steps_per_loop The number of steps per graph-mode loop.

Type: Int
Default: 200
learning_rate The initial learning rate for the Adam optimizer.

Type: Float
Default: 0.00005
scale_loss Whether to divide the loss by the number of replicas inside the per-replica loss function.

Type: Boolean
Default: False
use_keras_compile_fit Whether to use the Keras compile()/fit() API for the training logic.

Type: Boolean
Default: False
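The hyperparameters above are likewise passed as flags, and any you omit fall back to the documented defaults. A sketch of that merge-then-render pattern, using a hypothetical build_hparam_args helper (the DEFAULTS values mirror the table above; the helper itself is not part of the algorithm):

```python
# Documented defaults from the hyperparameter table above.
DEFAULTS = {
    "mode": "train_and_eval",
    "num_train_epochs": 3,
    "train_batch_size": 32,
    "eval_batch_size": 32,
    "steps_per_loop": 200,
    "learning_rate": 0.00005,
    "scale_loss": False,
    "use_keras_compile_fit": False,
}

def build_hparam_args(overrides=None):
    """Merge user overrides into the defaults and render --key=value flags."""
    params = {**DEFAULTS, **(overrides or {})}
    return [f"--{key}={value}" for key, value in params.items()]

# Override only the learning rate; every other flag keeps its default.
print(build_hparam_args({"learning_rate": 3e-5}))
```

This mirrors how the algorithm treats the table: required flags (such as input_meta_data_path and bert_config_file) must always be supplied explicitly, while the optional ones above default as shown.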