This tutorial demonstrates multi-objective optimization with AI Platform Optimizer.
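Objective
The goal is to minimize the objective metric
y1 = r*sin(theta)
while simultaneously maximizing the objective metric
y2 = r*cos(theta)
You evaluate both metrics over the following parameter space:
r in the range [0, 1]
theta in the range [0, pi/2]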
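To get a feel for the two metrics before calling the service, the following sketch evaluates y1 and y2 on a coarse grid over the parameter space. The helper name `evaluate_grid` and the grid resolution are illustrative assumptions and are not part of the tutorial or the Optimizer API.

```python
import math

def evaluate_grid(steps=5):
    """Evaluate (y1, y2) on a (steps + 1) x (steps + 1) grid (hypothetical helper)."""
    points = []
    for i in range(steps + 1):
        r = i / steps                          # r in [0, 1]
        for j in range(steps + 1):
            theta = (math.pi / 2) * j / steps  # theta in [0, pi/2]
            y1 = r * math.sin(theta)           # metric to minimize
            y2 = r * math.cos(theta)           # metric to maximize
            points.append((r, theta, y1, y2))
    return points

grid = evaluate_grid()
best_y1 = min(p[2] for p in grid)
best_y2 = max(p[3] for p in grid)
print('smallest y1 on the grid:', best_y1)
print('largest y2 on the grid:', best_y2)
```

Both extremes are reached at the corner r = 1, theta = 0, because sin(0) = 0 and cos(0) = 1.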
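Costs
This tutorial uses the following billable components of Google Cloud:
- AI Platform Training
- Cloud Storage
Learn about AI Platform Training pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.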
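Install packages and dependencies with pip
Install additional dependencies that are not already installed in the notebook environment.
- Use the latest major GA version of each framework.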
! pip install -U google-api-python-client
! pip install -U google-cloud
! pip install -U google-cloud-storage
! pip install -U requests
! pip install -U matplotlib
# Restart the kernel after pip installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)
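Set up your Google Cloud project
The following steps are required regardless of your notebook environment.
If you are running locally on your own machine, you need to install the Google Cloud SDK.
Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.
Note: Jupyter runs lines prefixed with ! as shell commands, and it interpolates Python variables prefixed with $ into these commands.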
PROJECT_ID = "[project-id]" #@param {type:"string"}
! gcloud config set project $PROJECT_ID
Authenticate your Google Cloud account
If you are using AI Platform Notebooks, your environment is already authenticated. Skip these steps.
import sys
# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your Google Cloud account. This provides access
# to your Cloud Storage bucket and lets you submit training jobs and prediction
# requests.
if 'google.colab' in sys.modules:
    from google.colab import auth as google_auth
    google_auth.authenticate_user()
# If you are running this tutorial in a notebook locally, replace the string
# below with the path to your service account key and run this cell to
# authenticate your Google Cloud account.
else:
    %env GOOGLE_APPLICATION_CREDENTIALS your_path_to_credentials.json
# Log in to your account on Google Cloud
!gcloud auth login
Import libraries
import json
import time
import datetime
from googleapiclient import errors
Tutorial
Setup
This section defines some parameters and utility methods for calling the AI Platform Optimizer APIs. To get started, fill in the following information.
# Update to your username
USER = '[user-id]' #@param {type: 'string'}
# These will be automatically filled in.
STUDY_ID = '{}_study_{}'.format(USER, datetime.datetime.now().strftime('%Y%m%d_%H%M%S')) #@param {type: 'string'}
REGION = 'us-central1'
def study_parent():
    return 'projects/{}/locations/{}'.format(PROJECT_ID, REGION)
def study_name(study_id):
    return 'projects/{}/locations/{}/studies/{}'.format(PROJECT_ID, REGION, study_id)
def trial_parent(study_id):
    return study_name(study_id)
def trial_name(study_id, trial_id):
    return 'projects/{}/locations/{}/studies/{}/trials/{}'.format(PROJECT_ID, REGION,
                                                                  study_id, trial_id)
def operation_name(operation_id):
    return 'projects/{}/locations/{}/operations/{}'.format(PROJECT_ID, REGION, operation_id)
print('USER: {}'.format(USER))
print('PROJECT_ID: {}'.format(PROJECT_ID))
print('REGION: {}'.format(REGION))
print('STUDY_ID: {}'.format(STUDY_ID))
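As a quick illustration of the resource name formats that the helpers above produce, the following sketch builds a trial name and recovers the trial ID from its last path segment, the same way the trial loop later in this tutorial does. The project ID, study ID, and lowercase variable names are made-up placeholders.

```python
# Hypothetical values, used only to illustrate the name formats.
project_id = 'my-project'
region = 'us-central1'
study_id = 'alice_study_20200101_000000'

def study_name(study_id):
    return 'projects/{}/locations/{}/studies/{}'.format(project_id, region, study_id)

def trial_name(study_id, trial_id):
    # A trial is a child resource of its study
    return '{}/trials/{}'.format(study_name(study_id), trial_id)

name = trial_name(study_id, 3)
print(name)

# The numeric trial ID is the last path segment of the resource name
recovered_id = int(name.split('/')[-1])
print(recovered_id)
```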
Build the API client
The following cell builds the auto-generated API client using the Google API discovery service. The JSON-format API schema is hosted in a Cloud Storage bucket.
from google.cloud import storage
from googleapiclient import discovery
_OPTIMIZER_API_DOCUMENT_BUCKET = 'caip-optimizer-public'
_OPTIMIZER_API_DOCUMENT_FILE = 'api/ml_public_google_rest_v1.json'
def read_api_document():
    client = storage.Client(PROJECT_ID)
    bucket = client.get_bucket(_OPTIMIZER_API_DOCUMENT_BUCKET)
    blob = bucket.get_blob(_OPTIMIZER_API_DOCUMENT_FILE)
    return blob.download_as_string()
ml = discovery.build_from_document(service=read_api_document())
print('Successfully built the client.')
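Create the study configuration
The following is a sample study configuration, built as a hierarchical Python dictionary. It is already filled out. Run the cell to configure the study.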
# Parameter Configuration
param_r = {
    'parameter': 'r',
    'type': 'DOUBLE',
    'double_value_spec': {
        'min_value': 0,
        'max_value': 1
    }
}
param_theta = {
    'parameter': 'theta',
    'type': 'DOUBLE',
    'double_value_spec': {
        'min_value': 0,
        'max_value': 1.57  # approximately pi/2
    }
}
# Objective Metrics
metric_y1 = {
    'metric': 'y1',
    'goal': 'MINIMIZE'
}
metric_y2 = {
    'metric': 'y2',
    'goal': 'MAXIMIZE'
}
# Put it all together in a study configuration
study_config = {
    'algorithm': 'ALGORITHM_UNSPECIFIED',  # Let the service choose the `default` algorithm.
    'parameters': [param_r, param_theta],
    'metrics': [metric_y1, metric_y2],
}
study = {'study_config': study_config}
print(json.dumps(study, indent=2, sort_keys=True))
Create the study
Next, create the study, which you will subsequently run to optimize the two objectives.
# Creates a study
req = ml.projects().locations().studies().create(
    parent=study_parent(), studyId=STUDY_ID, body=study)
try:
    print(req.execute())
except errors.HttpError as e:
    # HTTP 409 (Conflict) means a study with this ID already exists
    if e.resp.status == 409:
        print('Study already existed.')
    else:
        raise
Metric evaluation functions
Next, define some functions to evaluate the two objective metrics.
import math
# r * sin(theta)
def Metric1Evaluation(r, theta):
    """Evaluate the first metric on the trial."""
    return r * math.sin(theta)
# r * cos(theta)
def Metric2Evaluation(r, theta):
    """Evaluate the second metric on the trial."""
    return r * math.cos(theta)
def CreateMeasurement(trial_id, r, theta):
    print(("=========== Start Trial: [{0}] =============").format(trial_id))
    # Evaluate both objective metrics for this trial
    y1 = Metric1Evaluation(r, theta)
    y2 = Metric2Evaluation(r, theta)
    print('[r = {0}, theta = {1}] => y1 = r*sin(theta) = {2}, y2 = r*cos(theta) = {3}'.format(r, theta, y1, y2))
    metric1 = {'metric': 'y1', 'value': y1}
    metric2 = {'metric': 'y2', 'value': y2}
    # Return the results for this trial
    measurement = {'step_count': 1, 'metrics': [metric1, metric2]}
    return measurement
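To see the shape of the measurement body that gets reported for a trial, the following sketch builds one for a sample point and prints it as JSON. The function `create_measurement` is a simplified, hypothetical stand-in for `CreateMeasurement` above (it omits the logging and the trial ID), and the sample point is chosen only for illustration.

```python
import json
import math

def create_measurement(r, theta):
    """Build a measurement body holding both metrics (hypothetical helper)."""
    return {
        'step_count': 1,
        'metrics': [
            {'metric': 'y1', 'value': r * math.sin(theta)},
            {'metric': 'y2', 'value': r * math.cos(theta)},
        ],
    }

# Sample point at the corner r = 1, theta = 0
measurement = create_measurement(r=1.0, theta=0.0)
print(json.dumps(measurement, indent=2))
```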
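Set configuration parameters for running trials
client_id - The identifier of the client that is requesting the suggestion. If multiple SuggestTrialsRequests have the same client_id, the service returns the identical suggested trial if that trial is PENDING, and provides a new trial if the last suggested trial was completed.
suggestion_count_per_request - The number of suggestions (trials) requested in a single request.
max_trial_id_to_stop - The number of trials to explore before stopping. A small value such as 4 shortens the time it takes to run the code, but the study is not expected to converge. Reaching convergence likely takes about 20 trials (a good rule of thumb is to multiply the total number of dimensions by 10). The cell below uses 50.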
client_id = 'client1' #@param {type: 'string'}
suggestion_count_per_request = 5 #@param {type: 'integer'}
max_trial_id_to_stop = 50 #@param {type: 'integer'}
print('client_id: {}'.format(client_id))
print('suggestion_count_per_request: {}'.format(suggestion_count_per_request))
print('max_trial_id_to_stop: {}'.format(max_trial_id_to_stop))
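The rule of thumb for the trial budget can be written out as a short calculation. This is only a sketch: the helper names are hypothetical, and the request count assumes that every suggest request returns `suggestion_count` new trials, which the service does not guarantee.

```python
import math

def trial_budget(num_parameters, trials_per_dimension=10):
    """Rule-of-thumb trial count: total dimensions times 10 (hypothetical helper)."""
    return num_parameters * trials_per_dimension

def num_suggest_requests(max_trials, suggestion_count):
    """Approximate number of suggest requests needed to reach max_trials."""
    return math.ceil(max_trials / suggestion_count)

budget = trial_budget(num_parameters=2)  # the two parameters r and theta
requests_needed = num_suggest_requests(budget, suggestion_count=5)
print('trial budget:', budget)
print('suggest requests:', requests_needed)
```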
Run AI Platform Optimizer trials
Run the trials.
trial_id = 0
while trial_id < max_trial_id_to_stop:
    # Requests trials.
    resp = ml.projects().locations().studies().trials().suggest(
        parent=trial_parent(STUDY_ID),
        body={'client_id': client_id, 'suggestion_count': suggestion_count_per_request}).execute()
    op_id = resp['name'].split('/')[-1]
    # Polls the suggestion long-running operation until it is done.
    get_op = ml.projects().locations().operations().get(name=operation_name(op_id))
    while True:
        operation = get_op.execute()
        if 'done' in operation and operation['done']:
            break
        time.sleep(1)
    # The finished operation carries the suggested trials in its response.
    for suggested_trial in operation['response']['trials']:
        trial_id = int(suggested_trial['name'].split('/')[-1])
        # Fetches the suggested trial.
        trial = ml.projects().locations().studies().trials().get(name=trial_name(STUDY_ID, trial_id)).execute()
        if trial['state'] in ['COMPLETED', 'INFEASIBLE']:
            continue
        # Parses the suggested parameters.
        for param in trial['parameters']:
            if param['parameter'] == 'r':
                r = param['floatValue']
            elif param['parameter'] == 'theta':
                theta = param['floatValue']
        # Evaluates trials and reports measurement.
        ml.projects().locations().studies().trials().addMeasurement(
            name=trial_name(STUDY_ID, trial_id),
            body={'measurement': CreateMeasurement(trial_id, r, theta)}).execute()
        # Completes the trial.
        ml.projects().locations().studies().trials().complete(
            name=trial_name(STUDY_ID, trial_id)).execute()
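The polling step in the loop above waits indefinitely. One possible variation, sketched below, wraps the polling in a helper with a timeout. The names `wait_for_operation` and `FakeRequest` are hypothetical; `FakeRequest` only imitates a request object with an `execute()` method so that the sketch runs without calling the service.

```python
import time

def wait_for_operation(get_request, poll_interval=1.0, timeout=60.0):
    """Poll get_request.execute() until the operation is done or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        operation = get_request.execute()
        if operation.get('done'):
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError(
                'operation did not finish within {} seconds'.format(timeout))
        time.sleep(poll_interval)

class FakeRequest:
    """Stand-in for an API request object; reports done on the third call."""
    def __init__(self):
        self.calls = 0

    def execute(self):
        self.calls += 1
        if self.calls < 3:
            return {'name': 'operations/fake'}
        return {'name': 'operations/fake', 'done': True,
                'response': {'trials': []}}

fake = FakeRequest()
result = wait_for_operation(fake, poll_interval=0.0)
print(result['response']['trials'], fake.calls)
```

With the real client, `get_op` from the loop above could be passed in place of `fake`.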
[EXPERIMENTAL] Visualize the result
This section provides a module to visualize the trials for the study above.
max_trials_to_annotate = 20
import matplotlib.pyplot as plt
trial_ids = []
y1 = []
y2 = []
resp = ml.projects().locations().studies().trials().list(parent=trial_parent(STUDY_ID)).execute()
for trial in resp['trials']:
    if 'finalMeasurement' in trial:
        metrics = trial['finalMeasurement']['metrics']
        try:
            v1 = [m for m in metrics if m['metric'] == "y1"][0]['value']
            v2 = [m for m in metrics if m['metric'] == "y2"][0]['value']
        except (IndexError, KeyError):
            # Skips trials that are missing either metric
            continue
        # Appends all three values together so the lists stay aligned
        trial_ids.append(int(trial['name'].split('/')[-1]))
        y1.append(v1)
        y2.append(v2)
fig, ax = plt.subplots()
ax.scatter(y1, y2)
plt.xlabel("y1=r*sin(theta)")
plt.ylabel("y2=r*cos(theta)")
for i, trial_id in enumerate(trial_ids):
    # Only annotates the last `max_trials_to_annotate` trials
    if i >= len(trial_ids) - max_trials_to_annotate:
        ax.annotate(trial_id, (y1[i], y2[i]))
plt.gcf().set_size_inches((16, 16))
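Alongside the scatter plot, it can help to pick out the trials that no other trial beats on both metrics. The sketch below is a simple non-dominated filter for this tutorial's goals (minimize y1, maximize y2). The function name `pareto_front` is hypothetical, and the sample points are made up for illustration; they are not results from a study.

```python
def pareto_front(points):
    """Return the (y1, y2) points not dominated by any other point.

    A point b dominates a point a when b is no worse on both metrics
    (lower or equal y1, higher or equal y2) and strictly better on one.
    """
    front = []
    for i, (a1, a2) in enumerate(points):
        dominated = False
        for j, (b1, b2) in enumerate(points):
            if i == j:
                continue
            if b1 <= a1 and b2 >= a2 and (b1 < a1 or b2 > a2):
                dominated = True
                break
        if not dominated:
            front.append((a1, a2))
    return front

# Made-up (y1, y2) pairs, for illustration only
sample = [(0.1, 0.2), (0.3, 0.9), (0.5, 0.5), (0.2, 0.6), (0.4, 0.95)]
front = pareto_front(sample)
print(front)
```

To apply it to the study, the pairs could be built with `list(zip(y1, y2))` from the lists collected above.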
Cleaning up
To clean up all Google Cloud resources used in this project, you can delete the Google Cloud project you used for the tutorial.