このプロダクトは、次世代の AI Platform である Vertex AI でご利用いただけます。リソースを Vertex AI Vizier に移行することで、AI Platform では利用できない新しい機械学習機能を取得できます。

複数の目標の最適化

このチュートリアルを Colab のノートブックとして実行

GitHub のノートブックを表示

このチュートリアルでは、AI Platform Optimizer の複数目標の最適化について説明します。

目標

目的の指標 y1 = r*sin(theta) を minimize する

と同時に、目的の指標 y2 = r*cos(theta) を maximize することが目標です。

これらは次のパラメータ空間で評価します。

r: [0,1]
theta: [0, pi/2]

費用

このチュートリアルでは、Google Cloud の課金対象となるコンポーネントを使用します。

AI Platform Training
Cloud Storage

詳しくは、AI Platform Training の料金と Cloud Storage の料金をご覧ください。また、料金計算ツールを使用すると、予想される使用量に基づいて費用の見積もりを作成できます。

PIP インストールパッケージと依存関係

ノートブック環境にインストールされていない依存関係をインストールします。

フレームワークの最新の一般提供メジャーバージョンを使用します。

! pip install -U google-api-python-client
! pip install -U google-cloud
! pip install -U google-cloud-storage
! pip install -U requests
! pip install -U matplotlib

# Restart the kernel after pip installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

Google Cloud プロジェクトを設定する

ノートブック環境に関係なく、次の手順が必要です。

Google Cloud プロジェクトを選択または作成します。
プロジェクトに対して課金が有効になっていることを確認します。
AI Platform API を有効にします。
マシン上でローカルに実行している場合は、Google Cloud SDK をインストールする必要があります。
以下のセルにプロジェクト ID を入力します。次に、以下のセルを実行して、Cloud SDK がこのノートブックのすべてのコマンドに適切なプロジェクトを使用するようにします。

注: Jupyter は先頭に ! が付いた行をシェルコマンドとして実行し、これらのコマンドに先頭に $ が付いた Python 変数を挿入します。

PROJECT_ID = "[project-id]" #@param {type:"string"}
! gcloud config set project $PROJECT_ID

Google Cloud アカウントを認証する

AI Platform Notebooks を使用している場合、環境はすでに認証されています。以下の手順を省略します。

import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your Google Cloud account. This provides access
# to your Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

if 'google.colab' in sys.modules:
    from google.colab import auth as google_auth
    google_auth.authenticate_user()

# If you are running this tutorial in a notebook locally, replace the string
# below with the path to your service account key and run this cell to
# authenticate your Google Cloud account.
else:
    %env GOOGLE_APPLICATION_CREDENTIALS your_path_to_credentials.json

# Log in to your account on Google Cloud
!gcloud auth login

ライブラリのインポート

import json
import time
import datetime
from googleapiclient import errors

チュートリアル

設定

このセクションでは、AI Platform Optimizer API を呼び出すためのパラメータとユーティリティメソッドを定義します。開始するには、次の情報を入力してください。

# Update to your username
USER = '[user-id]' #@param {type: 'string'}

# These will be automatically filled in.
STUDY_ID = '{}_study_{}'.format(USER, datetime.datetime.now().strftime('%Y%m%d_%H%M%S')) #@param {type: 'string'}
REGION = 'us-central1'

def study_parent():
  return 'projects/{}/locations/{}'.format(PROJECT_ID, REGION)


def study_name(study_id):
  return 'projects/{}/locations/{}/studies/{}'.format(PROJECT_ID, REGION, study_id)


def trial_parent(study_id):
  return study_name(study_id)


def trial_name(study_id, trial_id):
  return 'projects/{}/locations/{}/studies/{}/trials/{}'.format(PROJECT_ID, REGION,
                                                                study_id, trial_id)

def operation_name(operation_id):
  return 'projects/{}/locations/{}/operations/{}'.format(PROJECT_ID, REGION, operation_id)


print('USER: {}'.format(USER))
print('PROJECT_ID: {}'.format(PROJECT_ID))
print('REGION: {}'.format(REGION))
print('STUDY_ID: {}'.format(STUDY_ID))

API クライアントのビルド

次のセルは、Google API Discovery Service を使用して自動生成された API クライアントをビルドします。JSON 形式 API スキーマは、Cloud Storage バケットにホストされています。

from google.cloud import storage
from googleapiclient import discovery


_OPTIMIZER_API_DOCUMENT_BUCKET = 'caip-optimizer-public'
_OPTIMIZER_API_DOCUMENT_FILE = 'api/ml_public_google_rest_v1.json'


def read_api_document():
  client = storage.Client(PROJECT_ID)
  bucket = client.get_bucket(_OPTIMIZER_API_DOCUMENT_BUCKET)
  blob = bucket.get_blob(_OPTIMIZER_API_DOCUMENT_FILE)
  return blob.download_as_string()


ml = discovery.build_from_document(service=read_api_document())
print('Successfully built the client.')

スタディ構成の作成

以下は、階層的な Python 辞書として作成されたスタディ構成のサンプルです。これはすでに入力されています。セルを実行してスタディを構成します。

# Parameter Configuration
param_r = {
    'parameter': 'r',
    'type' : 'DOUBLE',
    'double_value_spec' : {
        'min_value' : 0,
        'max_value' : 1
    }
}

param_theta = {
    'parameter': 'theta',
    'type' : 'DOUBLE',
    'double_value_spec' : {
        'min_value' : 0,
        'max_value' : 1.57
    }
}

# Objective Metrics
metric_y1 = {
    'metric' : 'y1',
    'goal' : 'MINIMIZE'
}

metric_y2 = {
    'metric' : 'y2',
    'goal' : 'MAXIMIZE'
}

# Put it all together in a study configuration
study_config = {
    'algorithm' : 'ALGORITHM_UNSPECIFIED',  # Let the service choose the `default` algorithm.
    'parameters' : [param_r, param_theta,],
    'metrics' : [metric_y1, metric_y2,],
}

study = {'study_config': study_config}
print(json.dumps(study, indent=2, sort_keys=True))

スタディの作成

次に、2 つの目標を最適化するために後で実行するスタディを作成します。

# Creates a study
req = ml.projects().locations().studies().create(
    parent=study_parent(), studyId=STUDY_ID, body=study)
try :
  print(req.execute())
except errors.HttpError as e:
  if e.resp.status == 409:
    print('Study already existed.')
  else:
    raise e

指標評価関数

次に、目標とする 2 つの指標を評価する関数を定義します。

import math


# r * sin(theta)
def Metric1Evaluation(r, theta):
  """Evaluate the first metric on the trial."""
  return r * math.sin(theta)


# r * cose(theta)
def Metric2Evaluation(r, theta):
  """Evaluate the second metric on the trial."""
  return r * math.cos(theta)


def CreateMeasurement(trial_id, r, theta):
  print(("=========== Start Trial: [{0}] =============").format(trial_id))

  # Evaluate both objective metrics for this trial
  y1 = Metric1Evaluation(r, theta)
  y2 = Metric2Evaluation(r, theta)
  print('[r = {0}, theta = {1}] => y1 = r*sin(theta) = {2}, y2 = r*cos(theta) = {3}'.format(r, theta, y1, y2))
  metric1 = {'metric': 'y1', 'value': y1}
  metric2 = {'metric': 'y2', 'value': y2}

  # Return the results for this trial
  measurement = {'step_count': 1, 'metrics': [metric1, metric2,]}
  return measurement

トライアルを実行するための構成パラメータの設定

client_id - 候補をリクエストしているクライアントの ID。複数の SuggestTrialsRequest に同じ client_id が指定されていて、トライアルが PENDING であれば、サービスは同じ推奨トライアルを返します。また、最後の推奨トライアルが完了すると新しいトライアルを入力します。

suggestion_count_per_request - 1 回のリクエストでリクエストされた候補（トライアル）の数。

max_trial_id_to_stop - 停止するまでのトライアル数。コードを実行する時間を短縮するために 4 に設定されているため、収束は期待できません。収束を行うには、20 程度に設定する必要があります（次元数の合計の 10 倍をおすすめします）。

client_id = 'client1' #@param {type: 'string'}
suggestion_count_per_request =  5 #@param {type: 'integer'}
max_trial_id_to_stop =  50 #@param {type: 'integer'}

print('client_id: {}'.format(client_id))
print('suggestion_count_per_request: {}'.format(suggestion_count_per_request))
print('max_trial_id_to_stop: {}'.format(max_trial_id_to_stop))

AI Platform Optimizer のトライアルの実行

トライアルを実行します。

trial_id = 0
while trial_id < max_trial_id_to_stop:
  # Requests trials.
  resp = ml.projects().locations().studies().trials().suggest(
    parent=trial_parent(STUDY_ID),
    body={'client_id': client_id, 'suggestion_count': suggestion_count_per_request}).execute()
  op_id = resp['name'].split('/')[-1]

  # Polls the suggestion long-running operations.
  get_op = ml.projects().locations().operations().get(name=operation_name(op_id))
  while True:
      operation = get_op.execute()
      if 'done' in operation and operation['done']:
        break
      time.sleep(1)

  for suggested_trial in get_op.execute()['response']['trials']:
    trial_id = int(suggested_trial['name'].split('/')[-1])
    # Featches the suggested trials.
    trial = ml.projects().locations().studies().trials().get(name=trial_name(STUDY_ID, trial_id)).execute()
    if trial['state'] in ['COMPLETED', 'INFEASIBLE']:
      continue

    # Parses the suggested parameters.
    params = {}
    for param in trial['parameters']:
      if param['parameter'] == 'r':
        r = param['floatValue']
      elif param['parameter'] == 'theta':
        theta = param['floatValue']

    # Evaluates trials and reports measurement.
    ml.projects().locations().studies().trials().addMeasurement(
        name=trial_name(STUDY_ID, trial_id),
        body={'measurement': CreateMeasurement(trial_id, r, theta)}).execute()
    # Completes the trial.
    ml.projects().locations().studies().trials().complete(
        name=trial_name(STUDY_ID, trial_id)).execute()

結果の可視化（試験運用版）

このセクションでは、上記のスタディのトライアルを可視化するモジュールを提供します。

max_trials_to_annotate = 20

import matplotlib.pyplot as plt
trial_ids = []
y1 = []
y2 = []
resp = ml.projects().locations().studies().trials().list(parent=trial_parent(STUDY_ID)).execute()
for trial in resp['trials']:
  if 'finalMeasurement' in trial:
    trial_ids.append(int(trial['name'].split('/')[-1]))
    metrics = trial['finalMeasurement']['metrics']
    try:
        y1.append([m for m in metrics if m['metric'] == "y1"][0]['value'])
        y2.append([m for m in metrics if m['metric'] == "y2"][0]['value'])
    except:
        pass

fig, ax = plt.subplots()
ax.scatter(y1, y2)
plt.xlabel("y1=r*sin(theta)")
plt.ylabel("y2=r*cos(theta)");
for i, trial_id in enumerate(trial_ids):
  # Only annotates the last `max_trials_to_annotate` trials
  if i > len(trial_ids) - max_trials_to_annotate:
    try:
        ax.annotate(trial_id, (y1[i], y2[i]))
    except:
        pass
plt.gcf().set_size_inches((16, 16))

クリーンアップ

このプロジェクトで使用しているすべての Google Cloud リソースをクリーンアップするには、チュートリアルで使用した Google Cloud プロジェクトを削除します。