Class HyperparameterTuningJob (1.1.1)

HyperparameterTuningJob(
    display_name: str,
    custom_job: google.cloud.aiplatform.jobs.CustomJob,
    metric_spec: Dict[str, str],
    parameter_spec: Dict[
        str, google.cloud.aiplatform.hyperparameter_tuning._ParameterSpec
    ],
    max_trial_count: int,
    parallel_trial_count: int,
    max_failed_trial_count: int = 0,
    search_algorithm: Optional[str] = None,
    measurement_selection: Optional[str] = "best",
    project: Optional[str] = None,
    location: Optional[str] = None,
    credentials: Optional[google.auth.credentials.Credentials] = None,
    encryption_spec_key_name: Optional[str] = None,
)

Vertex AI Hyperparameter Tuning Job.

Inheritance

builtins.object > google.cloud.aiplatform.base.VertexAiResourceNoun > builtins.object > google.cloud.aiplatform.base.FutureManager > google.cloud.aiplatform.base.VertexAiResourceNounWithFutureManager > google.cloud.aiplatform.jobs._Job > google.cloud.aiplatform.jobs._RunnableJob > HyperparameterTuningJob

Properties

network

The full name of the Google Compute Engine network to which this HyperparameterTuningJob should be peered.

Takes the format projects/{project}/global/networks/{network}, where {project} is a project number, as in 12345, and {network} is a network name.

Private services access must already be configured for the network. If left unspecified, the HyperparameterTuningJob is not peered with any network.
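As a minimal sketch of the format described above, the helper below assembles a network resource name. build_network_name is a hypothetical helper, not part of the SDK, and its digit check is an assumption based on the "project number, as in 12345" wording.

```python
def build_network_name(project_number, network):
    """Return projects/{project}/global/networks/{network}.

    Hypothetical helper for illustration; not part of google-cloud-aiplatform.
    """
    project_number = str(project_number)
    # The format expects a project number (for example 12345), not a project ID.
    if not project_number.isdigit():
        raise ValueError(f"expected a numeric project number, got {project_number!r}")
    if not network:
        raise ValueError("network name must not be empty")
    return f"projects/{project_number}/global/networks/{network}"


network_name = build_network_name("12345", "myVPC")
print(network_name)  # prints projects/12345/global/networks/myVPC
```

The resulting string has the shape expected by the network argument of run().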

Methods

HyperparameterTuningJob

HyperparameterTuningJob(
    display_name: str,
    custom_job: google.cloud.aiplatform.jobs.CustomJob,
    metric_spec: Dict[str, str],
    parameter_spec: Dict[
        str, google.cloud.aiplatform.hyperparameter_tuning._ParameterSpec
    ],
    max_trial_count: int,
    parallel_trial_count: int,
    max_failed_trial_count: int = 0,
    search_algorithm: Optional[str] = None,
    measurement_selection: Optional[str] = "best",
    project: Optional[str] = None,
    location: Optional[str] = None,
    credentials: Optional[google.auth.credentials.Credentials] = None,
    encryption_spec_key_name: Optional[str] = None,
)

Configures a HyperparameterTuningJob.

Example usage:

from google.cloud import aiplatform
from google.cloud.aiplatform import hyperparameter_tuning as hpt

# Placeholder image URI; replace with your own training container.
container_image_uri = "gcr.io/my-project/my-trainer:latest"

# One worker pool; every trial runs with this machine and container spec.
worker_pool_specs = [
    {
        "machine_spec": {
            "machine_type": "n1-standard-4",
            "accelerator_type": "NVIDIA_TESLA_K80",
            "accelerator_count": 1,
        },
        "replica_count": 1,
        "container_spec": {
            "image_uri": container_image_uri,
            "command": [],
            "args": [],
        },
    }
]

custom_job = aiplatform.CustomJob(
    display_name='my_job',
    worker_pool_specs=worker_pool_specs
)

hp_job = aiplatform.HyperparameterTuningJob(
    display_name='hp-test',
    custom_job=custom_job,
    # Metric reported by the training job and its optimization goal.
    metric_spec={
        'loss': 'minimize',
    },
    # Search space; each key is passed to the training job as a command-line argument.
    parameter_spec={
        'lr': hpt.DoubleParameterSpec(min=0.001, max=0.1, scale='log'),
        'units': hpt.IntegerParameterSpec(min=4, max=128, scale='linear'),
        'activation': hpt.CategoricalParameterSpec(values=['relu', 'selu']),
        'batch_size': hpt.DiscreteParameterSpec(values=[128, 256], scale='linear')
    },
    max_trial_count=128,
    parallel_trial_count=8,
)

hp_job.run()

print(hp_job.trials)

For more information on using hyperparameter tuning please visit: https://cloud.google.com/ai-platform-unified/docs/training/using-hyperparameter-tuning
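Each key in parameter_spec reaches the training code as a command-line keyword argument, so the container's entry point has to parse it. The sketch below shows one way to do that with argparse for the four parameters in the example above. The function name parse_trial_args, the --name=value flag spelling, and the chosen types are assumptions for illustration; reporting the 'loss' metric back to the service is not shown.

```python
import argparse


def parse_trial_args(argv=None):
    """Parse the hyperparameters that one trial receives on the command line."""
    parser = argparse.ArgumentParser(description="Hypothetical trial entry point")
    # One flag per key in parameter_spec.
    parser.add_argument("--lr", type=float, required=True)    # DoubleParameterSpec
    parser.add_argument("--units", type=int, required=True)   # IntegerParameterSpec
    parser.add_argument("--activation", choices=["relu", "selu"], required=True)
    # Parse via float first so both "128" and "128.0" are accepted
    # (an assumption about how discrete values may be rendered).
    parser.add_argument("--batch_size", type=lambda s: int(float(s)), required=True)
    return parser.parse_args(argv)


# Simulate the arguments of a single trial instead of reading sys.argv.
args = parse_trial_args(
    ["--lr=0.01", "--units=64", "--activation=relu", "--batch_size=128"]
)
print(args.lr, args.units, args.activation, args.batch_size)
```

In a real container, parse_trial_args() would be called without arguments so that argparse reads the process arguments.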

Parameters
Name  Description
display_name str

Required. The user-defined name of the HyperparameterTuningJob. The name can be up to 128 characters long and can consist of any UTF-8 characters.

custom_job aiplatform.CustomJob

Required. Configured CustomJob. The worker pool spec from this custom job applies to the CustomJobs created in all the trials.

parameter_spec Dict[str, hyperparameter_tuning._ParameterSpec]

Required. Dictionary representing parameters to optimize. The dictionary key is the parameter_id, which is passed into your training job as a command-line keyword argument, and the dictionary value is the specification of that parameter. For example:

from google.cloud.aiplatform import hyperparameter_tuning as hpt

parameter_spec={
    'decay': hpt.DoubleParameterSpec(min=1e-7, max=1, scale='linear'),
    'learning_rate': hpt.DoubleParameterSpec(min=1e-7, max=1, scale='linear'),
    'batch_size': hpt.DiscreteParameterSpec(values=[4, 8, 16, 32, 64, 128], scale='linear')
}

Supported parameter specifications can be found in aiplatform.hyperparameter_tuning. These parameter specifications are currently supported: DoubleParameterSpec, IntegerParameterSpec, CategoricalParameterSpec, DiscreteParameterSpec.

max_trial_count int

Required. The desired total number of Trials.

parallel_trial_count int

Required. The desired number of Trials to run in parallel.

max_failed_trial_count int

Optional. The number of failed Trials that need to be seen before failing the HyperparameterTuningJob. If set to 0, Vertex AI decides how many Trials must fail before the whole job fails.

search_algorithm str

The search algorithm specified for the Study. Accepts one of the following:

None - If you do not specify an algorithm, your job uses the default Vertex AI algorithm. The default algorithm applies Bayesian optimization to arrive at the optimal solution with a more effective search over the parameter space.

'grid' - A simple grid search within the feasible space. This option is particularly useful if you want to specify a quantity of trials that is greater than the number of points in the feasible space. In such cases, if you do not specify a grid search, the Vertex AI default algorithm may generate duplicate suggestions. To use grid search, all parameter specs must be of type IntegerParameterSpec, CategoricalParameterSpec, or DiscreteParameterSpec.

'random' - A simple random search within the feasible space.

measurement_selection str

Indicates which measurement to use if/when the service automatically selects the final measurement from previously reported intermediate measurements. Accepts: 'best', 'last'. Choose based on two considerations:

A) Do you expect your measurements to monotonically improve? If so, choose 'last'. On the other hand, if you're in a situation where your system can "over-train" and you expect the performance to get better for a while but then start declining, choose 'best'.

B) Are your measurements significantly noisy and/or irreproducible? If so, 'best' will tend to be over-optimistic, and it may be better to choose 'last'.

If both or neither of (A) and (B) apply, it doesn't matter which selection type is chosen.

project str

Optional. Project to run the HyperparameterTuningJob in. Overrides project set in aiplatform.init.

location str

Optional. Location to run the HyperparameterTuningJob in. Overrides location set in aiplatform.init.

credentials auth_credentials.Credentials

Optional. Custom credentials to use when calling the HyperparameterTuning service. Overrides credentials set in aiplatform.init.

encryption_spec_key_name str

Optional. Customer-managed encryption key options for a HyperparameterTuningJob. If this is set, then all resources created by the HyperparameterTuningJob will be encrypted with the provided encryption key.
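Two of the study-level options above, search_algorithm and measurement_selection, lend themselves to a small worked example. The sketch below uses plain Python with no SDK calls. It counts the points in a feasible space made only of integer, categorical, and discrete parameters (the case in which 'grid' is allowed), and it mimics how 'best' and 'last' would pick a final value from intermediate measurements. Both function names, the dictionary layout, and the selection logic are assumptions for illustration, not the service's implementation.

```python
from math import prod


def grid_point_count(space):
    """Count the points in a feasible space given as {name: list_of_allowed_values}."""
    return prod(len(values) for values in space.values())


def select_measurement(measurements, goal, selection="best"):
    """Pick the final value from intermediate measurements (illustrative only)."""
    if selection == "last":
        return measurements[-1]
    if selection == "best":
        return min(measurements) if goal == "minimize" else max(measurements)
    raise ValueError("selection must be 'best' or 'last'")


# Integer range 4..6, two categorical values, two discrete values.
space = {
    "units": list(range(4, 7)),
    "activation": ["relu", "selu"],
    "batch_size": [128, 256],
}
points = grid_point_count(space)  # 3 * 2 * 2 = 12
max_trial_count = 20
if max_trial_count > points:
    print(f"{max_trial_count} trials exceed {points} grid points; consider 'grid'")

# A loss curve that improves and then degrades ("over-training").
loss_curve = [0.9, 0.5, 0.3, 0.4, 0.6]
print(select_measurement(loss_curve, "minimize", "best"))  # 0.3
print(select_measurement(loss_curve, "minimize", "last"))  # 0.6
```

With a curve like this one, 'last' reports the degraded final value while 'best' reports the minimum, which matches consideration (A) in the measurement_selection description.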

run

run(
    service_account: Optional[str] = None,
    network: Optional[str] = None,
    timeout: Optional[int] = None,
    restart_job_on_worker_restart: bool = False,
    tensorboard: Optional[str] = None,
    sync: bool = True,
)

Run this configured HyperparameterTuningJob.

Parameters
Name  Description
service_account str

Optional. Specifies the service account to use as the workload run-as account. Users submitting jobs must have act-as permission on this run-as account.

network str

Optional. The full name of the Compute Engine network to which the job should be peered. For example, projects/12345/global/networks/myVPC. Private services access must already be configured for the network. If left unspecified, the job is not peered with any network.

timeout int

Optional. The maximum job running time in seconds. The default is 7 days.

restart_job_on_worker_restart bool

Restarts the entire CustomJob if a worker gets restarted. This feature can be used by distributed training jobs that are not resilient to workers leaving and joining a job.

tensorboard str

Optional. The name of a Vertex AI Tensorboard resource to which this CustomJob will upload Tensorboard logs. Format: projects/{project}/locations/{location}/tensorboards/{tensorboard}

The training script should write Tensorboard logs to the directory given by the following Vertex AI environment variable: AIP_TENSORBOARD_LOG_DIR

service_account is required when tensorboard is provided. For more information on configuring your service account, please visit: https://cloud.google.com/vertex-ai/docs/experiments/tensorboard-training

sync bool

Whether to execute this method synchronously. If False, this method returns without blocking and is executed in a concurrent Future.
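The constraints described for run() can be checked before a job is submitted. The sketch below collects keyword arguments for run(), converts a timedelta into the seconds that timeout expects, and enforces that service_account accompanies tensorboard. build_run_kwargs and the regular expression are hypothetical and not part of the SDK; the actual call is left as a comment because it needs a configured job and project.

```python
import re
from datetime import timedelta

# Resource-name format quoted in the tensorboard description above.
_TENSORBOARD_RE = re.compile(r"^projects/[^/]+/locations/[^/]+/tensorboards/[^/]+$")


def build_run_kwargs(timeout=None, service_account=None, tensorboard=None,
                     network=None, sync=True):
    """Validate and collect keyword arguments for run() (illustrative only)."""
    if tensorboard is not None:
        if not _TENSORBOARD_RE.match(tensorboard):
            raise ValueError(
                "tensorboard must look like "
                "projects/{project}/locations/{location}/tensorboards/{tensorboard}"
            )
        if service_account is None:
            raise ValueError("service_account is required when tensorboard is set")
    kwargs = {"sync": sync}
    if timeout is not None:
        # run() takes the maximum running time in whole seconds.
        kwargs["timeout"] = int(timeout.total_seconds())
    for name, value in (("service_account", service_account),
                        ("tensorboard", tensorboard),
                        ("network", network)):
        if value is not None:
            kwargs[name] = value
    return kwargs


run_kwargs = build_run_kwargs(timeout=timedelta(days=2), sync=False)
print(run_kwargs)  # {'sync': False, 'timeout': 172800}
# hp_job.run(**run_kwargs)  # hp_job: an already configured HyperparameterTuningJob
```

Leaving timeout out of the dictionary keeps the documented default of 7 days.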