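This legacy version of AI Platform Prediction is deprecated and will no longer be available on Google Cloud after January 31, 2025. All models, associated metadata, and deployments will be deleted after January 31, 2025. Migrate your resources to Vertex AI to get new machine learning features that are unavailable in AI Platform.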

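Custom prediction routines

Custom prediction routines let you determine what code runs when you send an online prediction request to AI Platform Prediction.

When you deploy a version resource to AI Platform Prediction without a custom prediction routine, it handles prediction requests by performing the prediction operation of the machine learning framework you used for training.

When you deploy a custom prediction routine as your version resource, you can tell AI Platform Prediction to run custom Python code in response to each prediction request it receives. You can preprocess prediction input before your trained model performs prediction, and you can postprocess the model's prediction before the prediction result is sent.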

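To create a custom prediction routine, you must provide two parts to AI Platform Prediction when you create your model version:

A model directory in Cloud Storage, which contains all artifacts needed for prediction.
A .tar.gz Python source distribution package in Cloud Storage, which contains your implementation of the Predictor interface and any other custom code that AI Platform Prediction uses at prediction time.

You can deploy a custom prediction routine only when your model version uses a legacy (MLS1) machine type.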

Upload model artifacts to your model directory

Follow the guide to deploying models to upload your trained model to Cloud Storage, along with any other files that provide data or statefulness for AI Platform Prediction to use during prediction.

The total file size of the model artifacts that you deploy to AI Platform Prediction must be 500 MB or less.
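Before uploading, it can help to check the combined size of the local artifacts against this limit. The following is a minimal sketch, not an official tool: the helper name `total_size_bytes` is made up for illustration, and reading "500 MB" as 500 * 1024 * 1024 bytes is an assumption, since this page does not say whether the limit is decimal or binary.

```python
import os
import tempfile

# Assumption: "500 MB" is interpreted as 500 * 1024 * 1024 bytes. The page
# does not state whether the limit is decimal or binary.
ASSUMED_LIMIT_BYTES = 500 * 1024 * 1024


def total_size_bytes(directory):
    """Returns the combined size of every file under `directory`."""
    total = 0
    for root, _dirs, files in os.walk(directory):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


# Demonstration on a throwaway directory holding two small stand-in files.
with tempfile.TemporaryDirectory() as model_dir:
    for name, size in (("model.joblib", 2048), ("preprocessor.pkl", 512)):
        with open(os.path.join(model_dir, name), "wb") as f:
            f.write(b"\0" * size)
    demo_total = total_size_bytes(model_dir)
    within_limit = demo_total <= ASSUMED_LIMIT_BYTES
```

In practice you would point `total_size_bytes` at the local directory that you are about to copy to Cloud Storage.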

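You can upload your trained machine learning model to your model directory as a TensorFlow SavedModel, a model.joblib file, a model.pkl file, or a model.bst file. You can also provide your model as an HDF5 file that contains a trained tf.keras model, or in a different serialized format.

You can additionally include a pickle file that contains an instance of a custom preprocessor class holding serialized state from training.

For example, consider the following preprocessor, defined in a file named preprocess.py: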

import numpy as np

class ZeroCenterer(object):
    """Stores means of each column of a matrix and uses them for preprocessing."""

    def __init__(self):
        """On initialization, is not tied to any distribution."""
        self._means = None

    def preprocess(self, data):
        """Transforms a matrix.

        The first time this is called, it stores the means of each column of
        the input. Then it transforms the input so each column has mean 0. For
        subsequent calls, it subtracts the stored means from each column. This
        lets you 'center' data at prediction time based on the distribution of
        the original training data.

        Args:
            data: A NumPy matrix of numerical data.

        Returns:
            A transformed matrix with the same dimensions as the input.
        """
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)
        return data - self._means

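During training on a numerical dataset, the preprocessor centers the data around 0 by subtracting the mean of each column from every value in that column. You can then export the preprocessor instance as a pickle file, preprocessor.pkl, which preserves the column means calculated from the training data.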
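The export step might look like the following sketch. It repeats a condensed copy of ZeroCenterer so that it runs on its own, and the training matrix is made-up illustrative data. In a real project you would import the class from preprocess.py instead, so that the pickle file records it under that module name.

```python
import os
import pickle
import tempfile

import numpy as np


class ZeroCenterer(object):
    """Condensed copy of the preprocessor defined in preprocess.py."""

    def __init__(self):
        self._means = None

    def preprocess(self, data):
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)
        return data - self._means


# Hypothetical training matrix: 3 rows, 2 numeric columns.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

preprocessor = ZeroCenterer()
centered = preprocessor.preprocess(train)  # first call stores the column means

with tempfile.TemporaryDirectory() as export_dir:
    path = os.path.join(export_dir, "preprocessor.pkl")
    # Export the fitted instance, including its stored means.
    with open(path, "wb") as f:
        pickle.dump(preprocessor, f)
    # Reload it the way a prediction routine would.
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored instance reuses the training means instead of recomputing them.
shifted = restored.preprocess(np.array([[4.0, 40.0]]))
```

Here `shifted` is the new row minus the training means, which is the behavior the prediction-time code relies on.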

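During prediction, a custom prediction routine can load the preprocessor from this file and apply the identical transformation to prediction input.

To learn how to use a stateful preprocessor like this in your custom prediction routine, read the next section, which describes how to implement the Predictor interface.

For a full example of using a stateful preprocessor during training and prediction, read Creating a custom prediction routine with Keras or Creating a custom prediction routine with scikit-learn.

Create your Predictor

Tell AI Platform Prediction how to handle prediction requests by providing it with a Predictor class. This is a class that implements the following interface: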

class Predictor(object):
    """Interface for constructing custom predictors."""

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Instances are the decoded values from the request. They have already
        been deserialized from JSON.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results. This list must
            be JSON serializable.
        """
        raise NotImplementedError()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of Predictor using the given path.

        Loading of the predictor should be done in this method.

        Args:
            model_dir: The local directory that contains the exported model
                file along with any additional files uploaded when creating the
                version resource.

        Returns:
            An instance implementing this Predictor class.
        """
        raise NotImplementedError()

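AI Platform Prediction prediction nodes use the from_path class method to load an instance of your Predictor. This method should load the artifacts that you saved in your model directory. The contents of that directory are copied from Cloud Storage to the local location given by the model_dir argument.

Whenever your deployment receives an online prediction request, the Predictor instance returned by from_path uses its predict method to generate predictions. AI Platform Prediction serializes the return value of this method to JSON and sends it as the response to the prediction request.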
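The division of labor between from_path and predict can be sketched locally as follows. This is only an illustration under stated assumptions: ScalingPredictor, the factor.json artifact, and the load_count counter are all hypothetical, and the temporary directory stands in for the local copy of the Cloud Storage model directory. The real service's response envelope is not reproduced here; the sketch serializes only the return value of predict.

```python
import json
import os
import tempfile


class ScalingPredictor(object):
    """Hypothetical Predictor that multiplies each input by a stored factor."""

    load_count = 0  # counts from_path calls, for illustration only

    def __init__(self, factor):
        self._factor = factor

    def predict(self, instances, **kwargs):
        # Returns a plain list, which is JSON serializable.
        return [x * self._factor for x in instances]

    @classmethod
    def from_path(cls, model_dir):
        # All artifact loading happens here, not in predict.
        cls.load_count += 1
        with open(os.path.join(model_dir, "factor.json")) as f:
            return cls(json.load(f)["factor"])


with tempfile.TemporaryDirectory() as model_dir:
    # Stand-in for artifacts copied from the Cloud Storage model directory.
    with open(os.path.join(model_dir, "factor.json"), "w") as f:
        json.dump({"factor": 3}, f)
    predictor = ScalingPredictor.from_path(model_dir)  # loaded once

# The same instance then serves every request.
responses = [json.dumps(predictor.predict(batch)) for batch in ([1, 2], [5])]
```

Keeping the expensive loading in from_path means each request only pays for the work done in predict.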

The predict method does not need to convert input from JSON into Python objects or convert output to JSON. AI Platform Prediction handles both conversions outside of the predict method.

AI Platform Prediction provides the instances argument by parsing the instances field from the body of the predict request sent to the AI Platform Training and Prediction API. It parses all other fields of the request body and passes them to the predict method as entries in the **kwargs dictionary. To learn more, read about how to structure a predict request to the AI Platform Training and Prediction API.
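To make the mapping from request body to method arguments concrete, here is a rough local imitation of that parsing step. It is a sketch only: the dispatch function is a hypothetical stand-in for what the service does, EchoPredictor is a toy class, and the extra field name "probabilities" is invented for the example.

```python
import json


class EchoPredictor(object):
    """Hypothetical Predictor that reports what it received."""

    def predict(self, instances, **kwargs):
        return [
            {"instance": instance, "extra_fields": sorted(kwargs)}
            for instance in instances
        ]


def dispatch(predictor, request_body):
    """Rough stand-in for the service: split the body, then call predict."""
    fields = json.loads(request_body)
    instances = fields.pop("instances")  # becomes the positional argument
    return predictor.predict(instances, **fields)  # the rest become kwargs


# "probabilities" is a made-up extra field, used only to show the mapping.
body = json.dumps({"instances": [[1, 2], [3, 4]], "probabilities": True})
result = dispatch(EchoPredictor(), body)
```

Inside predict, the extra field is available as kwargs["probabilities"], so a Predictor can use such fields as per-request options.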

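Continuing the example from the previous section, suppose your model directory contains preprocessor.pkl (the pickled instance of the ZeroCenterer class) and either a trained tf.keras model exported as model.h5 or a trained scikit-learn model exported as model.joblib.

Depending on which machine learning framework you use, implement one of the following Predictor classes in a file named predictor.py: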

TensorFlow

import os
import pickle

import numpy as np
from tensorflow import keras

class MyPredictor(object):
    """An example Predictor for an AI Platform custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only initialized via `from_path`."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Preprocesses inputs, then performs prediction using the trained Keras
        model.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results.
        """
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory in
        Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained Keras
                model and the pickled preprocessor instance. These are copied
                from the Cloud Storage model directory you provide when you
                deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        model_path = os.path.join(model_dir, "model.h5")
        model = keras.models.load_model(model_path)

        preprocessor_path = os.path.join(model_dir, "preprocessor.pkl")
        with open(preprocessor_path, "rb") as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)

scikit-learn

import os
import pickle

import numpy as np
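# sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed
# in 0.23, so prefer the standalone joblib package when it is available.
try:
    import joblib
except ImportError:
    from sklearn.externals import joblib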

class MyPredictor(object):
    """An example Predictor for an AI Platform custom prediction routine."""

    def __init__(self, model, preprocessor):
        """Stores artifacts for prediction. Only initialized via `from_path`."""
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        """Performs custom prediction.

        Preprocesses inputs, then performs prediction using the trained
        scikit-learn model.

        Args:
            instances: A list of prediction input instances.
            **kwargs: A dictionary of keyword args provided as additional
                fields on the predict request body.

        Returns:
            A list of outputs containing the prediction results.
        """
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)
        return outputs.tolist()

    @classmethod
    def from_path(cls, model_dir):
        """Creates an instance of MyPredictor using the given path.

        This loads artifacts that have been copied from your model directory in
        Cloud Storage. MyPredictor uses them during prediction.

        Args:
            model_dir: The local directory that contains the trained
                scikit-learn model and the pickled preprocessor instance. These
                are copied from the Cloud Storage model directory you provide
                when you deploy a version resource.

        Returns:
            An instance of `MyPredictor`.
        """
        model_path = os.path.join(model_dir, "model.joblib")
        model = joblib.load(model_path)

        preprocessor_path = os.path.join(model_dir, "preprocessor.pkl")
        with open(preprocessor_path, "rb") as f:
            preprocessor = pickle.load(f)

        return cls(model, preprocessor)

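Notice that the predict method converts the prediction results to a list with the tolist method before returning them. NumPy arrays are not JSON serializable, so you must convert them to lists of numbers, which are. Otherwise, AI Platform Prediction fails to send the prediction response.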
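The difference is easy to reproduce locally with the standard json module. The array below is made-up output used only for illustration.

```python
import json

import numpy as np

# Hypothetical model output for a single instance.
outputs = np.array([[0.25, 0.75]])

try:
    json.dumps(outputs)  # a raw NumPy array is rejected by the encoder
    error_message = None
except TypeError as err:
    error_message = str(err)

# Converting to nested Python lists makes the result serializable.
serialized = json.dumps(outputs.tolist())
```

The first call raises TypeError, while the second succeeds because tolist produces nested lists of plain Python floats.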

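Package your Predictor and its dependencies

You must package your Predictor as a .tar.gz source distribution package. NumPy, TensorFlow, and scikit-learn are included in the AI Platform Prediction runtime image, so you do not need to include these dependencies in the tarball. However, make sure to include any dependencies of the Predictor that are not pre-installed on AI Platform Prediction.

For the example above, you must include preprocess.py in your source distribution package, even though your Predictor does not explicitly import it. Otherwise, preprocessor = pickle.load(f) fails, because Python cannot resolve the class of the ZeroCenterer instance stored in the pickle file.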
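This failure mode can be reproduced locally. A pickle file stores only a reference to the class (module name plus class name), not the class definition, so unpickling needs the defining module to be importable. The sketch below uses a throwaway module with the hypothetical name toy_preprocess to stand in for preprocess.py, then removes it to simulate a package that omits the file.

```python
import importlib
import os
import pickle
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Write a stand-in module that defines the preprocessor class.
    with open(os.path.join(tmp, "toy_preprocess.py"), "w") as f:
        f.write("class ZeroCenterer(object):\n    pass\n")

    sys.path.insert(0, tmp)
    module = importlib.import_module("toy_preprocess")
    payload = pickle.dumps(module.ZeroCenterer())

    # Simulate a deployment whose package does not contain the module.
    sys.path.remove(tmp)
    del sys.modules["toy_preprocess"]
    importlib.invalidate_caches()

    try:
        pickle.loads(payload)
        load_error = None
    except ImportError as err:  # ModuleNotFoundError is a subclass
        load_error = err
```

With the module gone, pickle.loads raises ModuleNotFoundError, which is why preprocess.py must ship in the package even though predictor.py never imports it directly.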

The following setup.py packages these scripts:

from setuptools import setup

setup(name="my_custom_code", version="0.1", scripts=["predictor.py", "preprocess.py"])

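To package and upload the custom code example described on this page, do the following:

Create the preprocess.py, predictor.py, and setup.py files described in the previous sections, all in the same directory. Navigate to that directory in your shell.
Run python setup.py sdist --formats=gztar to create dist/my_custom_code-0.1.tar.gz.
Upload this tarball to a staging location in Cloud Storage.

The staging location does not have to be the same as your model directory. If you plan to iterate on and deploy multiple versions of your custom prediction routine, consider uploading your custom code packages to a designated staging directory. To keep track of different versions, you can increment the version argument in setup.py whenever you update the code.

The following command shows one way to upload your source distribution package to Cloud Storage: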
```
gsutil cp dist/my_custom_code-0.1.tar.gz gs://YOUR_BUCKET/PATH_TO_STAGING_DIR/
```

You can provide the code for your custom prediction routine in one or more packages.