This tutorial demonstrates AI Platform Optimizer multi-objective optimization.
Objective
The goal is to minimize the objective metric:
y1 = r*sin(theta)
and simultaneously maximize the objective metric:
y2 = r*cos(theta)
that you will evaluate over the parameter space:
r in [0, 1], theta in [0, pi/2]
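To make the trade-off between the two metrics concrete, the following is a minimal local sketch that evaluates both metrics on a coarse grid and keeps the non-dominated points. It does not call AI Platform Optimizer; the grid size and the helper names `evaluate` and `dominates` are assumptions made for illustration.

```python
import math

def evaluate(r, theta):
    """Return (y1, y2) for one point in the parameter space."""
    return r * math.sin(theta), r * math.cos(theta)

def dominates(a, b):
    """True if a is no worse than b on both metrics and strictly better on one.

    y1 (index 0) is minimized and y2 (index 1) is maximized.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

# Hypothetical 11 x 11 grid over r in [0, 1] and theta in [0, pi/2].
steps = 10
points = [
    evaluate(i / steps, (j / steps) * (math.pi / 2))
    for i in range(steps + 1)
    for j in range(steps + 1)
]

# Keep only the points that no other grid point dominates.
pareto = [p for p in points if not any(dominates(q, p) for q in points)]
print(pareto)
```

On this grid the only non-dominated point is y1 = 0, y2 = 1 (r = 1, theta = 0), because y1 cannot go below 0 and y2 cannot exceed 1 in this parameter space. Trials from a well-behaved study should therefore drift toward that corner.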
Costs
This tutorial uses billable components of Google Cloud:
- AI Platform Training
- Cloud Storage
Learn about AI Platform Training pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
PIP install packages and dependencies
Install additional dependencies not installed in the notebook environment.
- Use the latest major GA version of the framework.
! pip install -U google-api-python-client
! pip install -U google-cloud
! pip install -U google-cloud-storage
! pip install -U requests
! pip install -U matplotlib
# Restart the kernel after pip installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
Set up your Google Cloud project
The following steps are required, regardless of your notebook environment.
If running locally on your own machine, you will need to install the Google Cloud SDK.
Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.
Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.
PROJECT_ID = "[project-id]" #@param {type:"string"}
! gcloud config set project $PROJECT_ID
Authenticate your Google Cloud account
If you are using AI Platform Notebooks, your environment is already authenticated. Skip these steps.
import sys
# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your Google Cloud account. This provides access
# to your Cloud Storage bucket and lets you submit training jobs and prediction
# requests.
if 'google.colab' in sys.modules:
    from google.colab import auth as google_auth
    google_auth.authenticate_user()
# If you are running this tutorial in a notebook locally, replace the string
# below with the path to your service account key and run this cell to
# authenticate your Google Cloud account.
else:
    %env GOOGLE_APPLICATION_CREDENTIALS your_path_to_credentials.json
    # Log in to your account on Google Cloud
    !gcloud auth login
Import libraries
import json
import time
import datetime
from googleapiclient import errors
Tutorial
Setup
This section defines some parameters and utility functions for calling the AI Platform Optimizer APIs. Fill in the following information to get started.
# Update to your username
USER = '[user-id]' #@param {type: 'string'}
# These will be automatically filled in.
STUDY_ID = '{}_study_{}'.format(USER, datetime.datetime.now().strftime('%Y%m%d_%H%M%S')) #@param {type: 'string'}
REGION = 'us-central1'
def study_parent():
    return 'projects/{}/locations/{}'.format(PROJECT_ID, REGION)
def study_name(study_id):
    return 'projects/{}/locations/{}/studies/{}'.format(PROJECT_ID, REGION, study_id)
def trial_parent(study_id):
    return study_name(study_id)
def trial_name(study_id, trial_id):
    return 'projects/{}/locations/{}/studies/{}/trials/{}'.format(PROJECT_ID, REGION,
                                                                  study_id, trial_id)
def operation_name(operation_id):
    return 'projects/{}/locations/{}/operations/{}'.format(PROJECT_ID, REGION, operation_id)
print('USER: {}'.format(USER))
print('PROJECT_ID: {}'.format(PROJECT_ID))
print('REGION: {}'.format(REGION))
print('STUDY_ID: {}'.format(STUDY_ID))
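The helper functions above all build hierarchical resource names. As a small illustration of the resulting format, the sketch below builds a trial name from explicit arguments and then recovers the trial ID from it. The function `build_trial_name` and the project and study values are hypothetical placeholders, not part of the tutorial's setup.

```python
def build_trial_name(project_id, region, study_id, trial_id):
    """Hypothetical standalone version of trial_name() that takes every part as an argument."""
    return 'projects/{}/locations/{}/studies/{}/trials/{}'.format(
        project_id, region, study_id, trial_id)

# Placeholder values for illustration only.
example_name = build_trial_name('my-project', 'us-central1', 'alice_study_20240101_000000', 3)
print(example_name)

# The trial ID is the last path segment of the resource name.
example_trial_id = int(example_name.split('/')[-1])
print(example_trial_id)
```

Later cells use the same `split('/')[-1]` step to extract operation and trial IDs from the names returned by the service.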
Build the API client
The following cell builds the auto-generated API client using Google API discovery service. The JSON format API schema is hosted in a Cloud Storage bucket.
from google.cloud import storage
from googleapiclient import discovery
_OPTIMIZER_API_DOCUMENT_BUCKET = 'caip-optimizer-public'
_OPTIMIZER_API_DOCUMENT_FILE = 'api/ml_public_google_rest_v1.json'
def read_api_document():
    client = storage.Client(PROJECT_ID)
    bucket = client.get_bucket(_OPTIMIZER_API_DOCUMENT_BUCKET)
    blob = bucket.get_blob(_OPTIMIZER_API_DOCUMENT_FILE)
    return blob.download_as_string()
ml = discovery.build_from_document(service=read_api_document())
print('Successfully built the client.')
Create the study configuration
The following is a sample study configuration, built as a hierarchical Python dictionary. It is already filled out. Run the cell to configure the study.
# Parameter Configuration
param_r = {
    'parameter': 'r',
    'type' : 'DOUBLE',
    'double_value_spec' : {
        'min_value' : 0,
        'max_value' : 1
    }
}
param_theta = {
    'parameter': 'theta',
    'type' : 'DOUBLE',
    'double_value_spec' : {
        'min_value' : 0,
        'max_value' : 1.57  # Approximately pi/2
    }
}
# Objective Metrics
metric_y1 = {
    'metric' : 'y1',
    'goal' : 'MINIMIZE'
}
metric_y2 = {
    'metric' : 'y2',
    'goal' : 'MAXIMIZE'
}
# Put it all together in a study configuration
study_config = {
    'algorithm' : 'ALGORITHM_UNSPECIFIED', # Let the service choose the `default` algorithm.
    'parameters' : [param_r, param_theta,],
    'metrics' : [metric_y1, metric_y2,],
}
study = {'study_config': study_config}
print(json.dumps(study, indent=2, sort_keys=True))
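Before sending a configuration to the service, it can help to sanity-check it locally. The sketch below is one possible check, not the validation that AI Platform Optimizer itself performs; `validate_study_config` and `example_config` are hypothetical names, and the example uses `math.pi / 2` as the upper bound for theta instead of the rounded 1.57.

```python
import math

def validate_study_config(config):
    """Return a list of human-readable problems found in a study configuration."""
    problems = []
    for param in config.get('parameters', []):
        spec = param.get('double_value_spec')
        if param.get('type') == 'DOUBLE' and spec is not None:
            if spec['min_value'] >= spec['max_value']:
                problems.append(
                    '{}: min_value must be below max_value'.format(param['parameter']))
    for metric in config.get('metrics', []):
        if metric.get('goal') not in ('MINIMIZE', 'MAXIMIZE'):
            problems.append(
                '{}: unknown goal {!r}'.format(metric['metric'], metric.get('goal')))
    return problems

# Same shape as the tutorial's study_config, under a separate name.
example_config = {
    'algorithm': 'ALGORITHM_UNSPECIFIED',
    'parameters': [
        {'parameter': 'r', 'type': 'DOUBLE',
         'double_value_spec': {'min_value': 0, 'max_value': 1}},
        {'parameter': 'theta', 'type': 'DOUBLE',
         'double_value_spec': {'min_value': 0, 'max_value': math.pi / 2}},
    ],
    'metrics': [
        {'metric': 'y1', 'goal': 'MINIMIZE'},
        {'metric': 'y2', 'goal': 'MAXIMIZE'},
    ],
}

print(validate_study_config(example_config))
```

An empty list means the check found nothing to report; the service may still reject a configuration for reasons this sketch does not cover.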
Create the study
Next, create the study, which you will subsequently run to optimize the two objectives.
# Creates a study
req = ml.projects().locations().studies().create(
    parent=study_parent(), studyId=STUDY_ID, body=study)
try:
    print(req.execute())
except errors.HttpError as e:
    if e.resp.status == 409:
        print('Study already exists.')
    else:
        raise
Metric evaluation functions
Next, define some functions to evaluate the two objective metrics.
import math
# r * sin(theta)
def Metric1Evaluation(r, theta):
    """Evaluate the first metric on the trial."""
    return r * math.sin(theta)
# r * cos(theta)
def Metric2Evaluation(r, theta):
    """Evaluate the second metric on the trial."""
    return r * math.cos(theta)
def CreateMeasurement(trial_id, r, theta):
    print(("=========== Start Trial: [{0}] =============").format(trial_id))
    # Evaluate both objective metrics for this trial
    y1 = Metric1Evaluation(r, theta)
    y2 = Metric2Evaluation(r, theta)
    print('[r = {0}, theta = {1}] => y1 = r*sin(theta) = {2}, y2 = r*cos(theta) = {3}'.format(r, theta, y1, y2))
    metric1 = {'metric': 'y1', 'value': y1}
    metric2 = {'metric': 'y2', 'value': y2}
    # Return the results for this trial
    measurement = {'step_count': 1, 'metrics': [metric1, metric2,]}
    return measurement
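As a quick check of the measurement payload, the sketch below builds the same dictionary shape for a single hand-picked point. `build_measurement` is a hypothetical, non-printing variant of `CreateMeasurement` written only for this illustration.

```python
import math

def build_measurement(r, theta):
    """Hypothetical quiet variant of CreateMeasurement: same payload, no printing."""
    return {
        'step_count': 1,
        'metrics': [
            {'metric': 'y1', 'value': r * math.sin(theta)},
            {'metric': 'y2', 'value': r * math.cos(theta)},
        ],
    }

# At r = 1, theta = 0 the first metric is at its minimum and the second at its maximum.
example_measurement = build_measurement(r=1.0, theta=0.0)
print(example_measurement)
```

The `metric` names in the payload must match the metric names declared in the study configuration (`y1` and `y2`).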
Set configuration parameters for running trials
client_id
- The identifier of the client that is requesting the suggestion. If multiple SuggestTrialsRequests have the same client_id, the service will return the identical suggested trial if the trial is PENDING, and provide a new trial if the last suggested trial was completed.
suggestion_count_per_request
- The number of suggestions (trials) requested in a single request.
max_trial_id_to_stop
- The number of trials to explore before stopping. The cell below sets it to 50. As a rule of thumb, convergence needs roughly 10 trials per parameter dimension, which is about 20 for this two-parameter study. Lower the value to shorten the run, at the cost of possibly not converging.
client_id = 'client1' #@param {type: 'string'}
suggestion_count_per_request = 5 #@param {type: 'integer'}
max_trial_id_to_stop = 50 #@param {type: 'integer'}
print('client_id: {}'.format(client_id))
print('suggestion_count_per_request: {}'.format(suggestion_count_per_request))
print('max_trial_id_to_stop: {}'.format(max_trial_id_to_stop))
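The arithmetic behind these settings can be sketched as follows. The helper names are hypothetical, and the request estimate assumes that every request returns exactly the requested number of new trials with consecutively numbered trial IDs, which the service does not guarantee.

```python
import math

def rule_of_thumb_trials(num_parameters, trials_per_dimension=10):
    """Rough trial budget: about 10 trials per parameter dimension."""
    return num_parameters * trials_per_dimension

def requests_needed(max_trials, suggestions_per_request):
    """Estimated number of suggest requests needed to reach max_trials."""
    return math.ceil(max_trials / suggestions_per_request)

# Two parameters (r and theta) give a budget of about 20 trials.
print(rule_of_thumb_trials(2))

# With the values set above, 50 trials at 5 per request take about 10 requests.
print(requests_needed(50, 5))
```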
Run AI Platform Optimizer trials
Run the trials.
trial_id = 0
while trial_id < max_trial_id_to_stop:
    # Requests trials.
    resp = ml.projects().locations().studies().trials().suggest(
        parent=trial_parent(STUDY_ID),
        body={'client_id': client_id, 'suggestion_count': suggestion_count_per_request}).execute()
    op_id = resp['name'].split('/')[-1]
    # Polls the suggestion long-running operation until it is done.
    get_op = ml.projects().locations().operations().get(name=operation_name(op_id))
    while True:
        operation = get_op.execute()
        if 'done' in operation and operation['done']:
            break
        time.sleep(1)
    for suggested_trial in operation['response']['trials']:
        trial_id = int(suggested_trial['name'].split('/')[-1])
        # Fetches the suggested trial.
        trial = ml.projects().locations().studies().trials().get(name=trial_name(STUDY_ID, trial_id)).execute()
        if trial['state'] in ['COMPLETED', 'INFEASIBLE']:
            continue
        # Parses the suggested parameters.
        for param in trial['parameters']:
            if param['parameter'] == 'r':
                r = param['floatValue']
            elif param['parameter'] == 'theta':
                theta = param['floatValue']
        # Evaluates the trial and reports the measurement.
        ml.projects().locations().studies().trials().addMeasurement(
            name=trial_name(STUDY_ID, trial_id),
            body={'measurement': CreateMeasurement(trial_id, r, theta)}).execute()
        # Completes the trial.
        ml.projects().locations().studies().trials().complete(
            name=trial_name(STUDY_ID, trial_id)).execute()
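The polling loop above waits indefinitely if an operation never finishes. One way to bound the wait is sketched below as a hypothetical helper, `wait_for_operation`, which takes any zero-argument callable that returns the operation dictionary. The timeout and poll interval defaults are arbitrary assumptions, and the example uses a stand-in iterator instead of a real API call.

```python
import time

def wait_for_operation(fetch, poll_interval=1.0, timeout=60.0):
    """Call fetch() until the returned operation has done=True, or raise TimeoutError.

    fetch is any zero-argument callable that returns the operation as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        operation = fetch()
        if operation.get('done'):
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError('operation did not finish within {} seconds'.format(timeout))
        time.sleep(poll_interval)

# Stand-in for a real operation: not done twice, then done with an empty trial list.
fake_responses = iter([
    {'name': 'operations/example'},
    {'name': 'operations/example'},
    {'name': 'operations/example', 'done': True, 'response': {'trials': []}},
])
finished = wait_for_operation(lambda: next(fake_responses), poll_interval=0.0)
print(finished['response'])
```

In the loop above, the equivalent call would pass `get_op.execute` as the `fetch` argument.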
[EXPERIMENTAL] Visualize the result
This section provides a module to visualize the trials for the above study.
max_trials_to_annotate = 20
import matplotlib.pyplot as plt
trial_ids = []
y1 = []
y2 = []
resp = ml.projects().locations().studies().trials().list(parent=trial_parent(STUDY_ID)).execute()
for trial in resp['trials']:
    if 'finalMeasurement' in trial:
        metrics = trial['finalMeasurement']['metrics']
        try:
            v1 = [m for m in metrics if m['metric'] == "y1"][0]['value']
            v2 = [m for m in metrics if m['metric'] == "y2"][0]['value']
        except (IndexError, KeyError):
            # Skip trials that are missing either metric.
            continue
        # Append all three values together so the lists stay aligned.
        trial_ids.append(int(trial['name'].split('/')[-1]))
        y1.append(v1)
        y2.append(v2)
fig, ax = plt.subplots()
ax.scatter(y1, y2)
plt.xlabel("y1=r*sin(theta)")
plt.ylabel("y2=r*cos(theta)")
for i, trial_id in enumerate(trial_ids):
    # Only annotates the last `max_trials_to_annotate` trials
    if i >= len(trial_ids) - max_trials_to_annotate:
        ax.annotate(trial_id, (y1[i], y2[i]))
plt.gcf().set_size_inches((16, 16))
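The extraction step in the plotting cell can also be written as a standalone function, which makes it easy to try without calling the API. The sketch below assumes the response fields used above (`name`, `finalMeasurement`, `metrics`, `metric`, `value`); `extract_points` and the sample trials are hypothetical.

```python
def extract_points(trials):
    """Return (trial_id, y1, y2) tuples for trials that report both metrics."""
    points = []
    for trial in trials:
        measurement = trial.get('finalMeasurement')
        if measurement is None:
            continue
        values = {m['metric']: m['value'] for m in measurement.get('metrics', [])}
        if 'y1' in values and 'y2' in values:
            trial_id = int(trial['name'].split('/')[-1])
            points.append((trial_id, values['y1'], values['y2']))
    return points

# Made-up trials in the same shape as the list response.
sample_trials = [
    {'name': 'projects/p/locations/us-central1/studies/s/trials/1',
     'finalMeasurement': {'metrics': [{'metric': 'y1', 'value': 0.2},
                                      {'metric': 'y2', 'value': 0.9}]}},
    # No final measurement yet, so this trial is skipped.
    {'name': 'projects/p/locations/us-central1/studies/s/trials/2'},
    # Missing y2, so this trial is skipped as well.
    {'name': 'projects/p/locations/us-central1/studies/s/trials/3',
     'finalMeasurement': {'metrics': [{'metric': 'y1', 'value': 0.1}]}},
]
print(extract_points(sample_trials))
```

Keeping each trial ID together with its metric values in one tuple avoids the ID and value lists drifting out of step when a trial is skipped.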
Cleaning up
To clean up all Google Cloud resources used in this project, you can delete the Google Cloud project you used for the tutorial.