Getting started: Training and prediction with Keras


This tutorial shows how to train a neural network on AI Platform using the Keras Sequential API, and how to serve predictions from that model.

Keras is a high-level API for building and training deep learning models; tf.keras is TensorFlow's implementation of this API.

The first two parts of this tutorial walk you through training a model on AI Platform using prewritten Keras code, deploying the trained model to AI Platform, and serving online predictions from the deployed model.

The last part of the tutorial takes a deeper look at the training code used for this model and how to ensure the model is compatible with AI Platform. To learn more about building machine learning models in Keras more generally, read TensorFlow's Keras tutorials.

Dataset

This tutorial uses the United States Census Income Dataset provided by the UC Irvine Machine Learning Repository. This dataset contains information about people from the 1994 Census database, including age, education, marital status, occupation, and whether they make more than $50,000 a year.

Objectives

The goal is to use Keras to train a deep neural network (DNN) that predicts whether a person makes more than $50,000 a year (the target label) based on other Census information about that person (the features).

This tutorial focuses more on using this model with AI Platform than on the design of the model itself. However, it's always important to think about potential problems and unintended consequences when building machine learning systems. See the Machine Learning Crash Course exercise about fairness to learn about sources of bias in the Census dataset, as well as about machine learning fairness more generally.

Costs

This tutorial uses the following billable components of Google Cloud Platform (GCP):

  • AI Platform
  • Cloud Storage

Learn about AI Platform pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.

Before you begin

You must do several things before you can train and deploy a model in AI Platform:

  • Set up your local development environment.
  • Set up a GCP project with billing and the necessary APIs enabled.
  • Create a Cloud Storage bucket to store your training package and your trained model.

Set up your local development environment

You need the following to complete this tutorial:

  • Git
  • Python 3
  • virtualenv
  • Cloud SDK

Google Cloud's guide to setting up a Python development environment provides detailed instructions for meeting these requirements. The following steps are a condensed version:

  1. Install Python 3.

  2. Install virtualenv and create a virtual environment that uses Python 3 (a rough sketch of this and the next step follows this list).

  3. Activate the environment you created in the previous step.

  4. Complete the steps in the following section to install the Cloud SDK.
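
As a rough sketch of steps 2 and 3 on a typical Linux or macOS shell (the environment name env is arbitrary, and your platform's commands may differ):

# Install virtualenv and create a virtual environment that uses Python 3.
# The environment name "env" is just an example.
pip install --user virtualenv
virtualenv --python python3 env

# Activate the virtual environment.
source env/bin/activate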

Set up your GCP project

  1. Sign in to your Google account.

    If you don't already have one, sign up for a new account.

  2. Select or create a Google Cloud Platform project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your Google Cloud Platform project.

    Learn how to enable billing

  4. Enable the AI Platform ("Cloud Machine Learning Engine") and Compute Engine APIs.

    Enable the APIs

  5. Install and initialize the Cloud SDK.

Authenticate your GCP account

To set up authentication, create a service account key and set an environment variable to the file path of that key.

  1. Create a service account key to use for authentication:
    1. In the GCP Console, go to the Create service account key page.

      Go to the Create service account key page
    2. From the Service account drop-down list, select New service account.
    3. In the Service account name field, enter a name.
    4. From the Role drop-down list, select Machine Learning Engine > ML Engine Admin and Storage > Storage Object Admin.

      Note: The Role field authorizes your service account to access resources. You can view and change this field later by using the GCP Console. If you are developing a production application, you may need to specify more granular permissions than Machine Learning Engine > ML Engine Admin and Storage > Storage Object Admin. For more information, see access control for AI Platform.
    5. Click Create. A JSON file that contains your key downloads to your computer.
  2. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the file path of the JSON file that contains your service account key, as shown in the example below. This variable only applies to your current shell session, so if you open a new session, set the variable again.
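
    For example, in a bash shell (replace [PATH] with the path of the JSON key file you downloaded):

    export GOOGLE_APPLICATION_CREDENTIALS="[PATH]"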

Create a Cloud Storage bucket

When you submit a training job using the Cloud SDK, you upload a Python package containing your training code to a Cloud Storage bucket. AI Platform runs the code from this package. In this tutorial, AI Platform also saves the trained model that results from your job in the same bucket. You can then create an AI Platform model version based on this output to serve online predictions.

Set the name of your Cloud Storage bucket as an environment variable. The name must be unique across all Cloud Storage buckets:

BUCKET_NAME="your-bucket-name"

Select a region where AI Platform Training and Prediction are available, and create another environment variable. For example:

REGION="us-central1"

Create your Cloud Storage bucket in this region; later, use the same region for training and prediction. If you don't already have a bucket, run the following command to create one:

gsutil mb -l $REGION gs://$BUCKET_NAME

Quickstart for training in AI Platform

This section of the tutorial walks you through submitting a training job to AI Platform. The job runs sample code that uses Keras to train a deep neural network on the United States Census data, and exports the trained model as a TensorFlow SavedModel directory in your Cloud Storage bucket.

Get training code and dependencies

First, download the training code and change the working directory:

# Clone the repository of AI Platform samples
git clone --depth 1 https://github.com/GoogleCloudPlatform/cloudml-samples

# Set the working directory to the sample code directory
cd cloudml-samples/census/tf-keras

Notice that the training code is structured as a Python package in the trainer/ subdirectory:

# `ls` shows the working directory's contents. The `p` flag adds trailing
# slashes to subdirectory names. The `R` flag lists subdirectories recursively.
ls -pR
.:
README.md  requirements.txt  trainer/

./trainer:
__init__.py  model.py  task.py  util.py

Next, install the Python dependencies needed to train the model locally:

pip install -r requirements.txt

When you run a training job in AI Platform, dependencies are preinstalled based on the runtime version you choose.

Train your model locally

Before training on AI Platform, train the job locally to verify that the file structure and packaging are correct.

For a complex or resource-intensive job, you may want to train locally on a small sample of your dataset to verify your code. Then you can run the job on AI Platform to train on the whole dataset.

This sample runs a relatively quick job on a small dataset, so local training and the AI Platform job run the same code on the same data.
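
For larger datasets, a sketch like the following (not part of this tutorial's code; the file names are hypothetical) shows one way to subsample data with pandas before a local validation run:

import pandas as pd

# Hypothetical example: read the full training data and keep a
# reproducible 10% sample for a quick local run.
train_df = pd.read_csv('training-data.csv')
train_sample = train_df.sample(frac=0.1, random_state=42)
train_sample.to_csv('training-data-sample.csv', index=False)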

Run the following command to train the model locally:

# This is similar to `python -m trainer.task --job-dir local-training-output`
# but it better replicates the AI Platform environment, especially
# for distributed training (not applicable here).
gcloud ai-platform local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir local-training-output

Watch the training progress in your shell. At the end, the training application exports the trained model and prints a message like the following:

Model exported to:  local-training-output/keras_export/1553709223

Train your model using AI Platform

Next, submit the training job to AI Platform. This runs the training module in the cloud and exports the trained model to Cloud Storage.

First, give your training job a name and choose a directory within your Cloud Storage bucket for saving intermediate and output files, then set these as environment variables. For example:

JOB_NAME="my_first_keras_job"
JOB_DIR="gs://$BUCKET_NAME/keras-job-dir"

Run the following command to package the trainer/ directory, upload it to the specified --job-dir, and instruct AI Platform to run the trainer.task module from that package.

The --stream-logs flag lets you view the training logs in your shell. You can also see logs and other job details in the GCP Console.

gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir $JOB_DIR \
  --stream-logs

This training job can take longer than local training, but you can monitor its progress in your shell in a similar way. At the end, the training job exports the trained model to your Cloud Storage bucket and prints a message like the following:

INFO    2019-03-27 17:57:11 +0000   master-replica-0        Model exported to:  gs://your-bucket-name/keras-job-dir/keras_export/1553709421
INFO    2019-03-27 17:57:11 +0000   master-replica-0        Module completed; cleaning up.
INFO    2019-03-27 17:57:11 +0000   master-replica-0        Clean up finished.
INFO    2019-03-27 17:57:11 +0000   master-replica-0        Task completed successfully.

Hyperparameter tuning

You can optionally perform hyperparameter tuning by using the included hptuning_config.yaml configuration file. This file tells AI Platform to tune the batch size and learning rate for training over multiple trials to maximize accuracy.

In this example, the training code uses a TensorBoard callback, which creates TensorFlow SummaryEvents during training. AI Platform uses these events to track the metric you want to optimize. Learn more about hyperparameter tuning in AI Platform Training.
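
The actual hptuning_config.yaml ships with the sample code; as a rough, hypothetical sketch of the format AI Platform expects (the parameter names, ranges, and trial counts below are illustrative, not necessarily the sample's real values), such a file looks something like this:

# Hypothetical sketch of an AI Platform hyperparameter tuning config.
# See the real hptuning_config.yaml in the cloudml-samples repository.
trainingInput:
  hyperparameters:
    goal: MAXIMIZE                      # maximize the metric below
    maxTrials: 4                        # total tuning trials
    maxParallelTrials: 2                # trials run concurrently
    hyperparameterMetricTag: accuracy   # metric reported via TensorBoard
    params:
      - parameterName: batch-size
        type: INTEGER
        minValue: 8
        maxValue: 512
        scaleType: UNIT_LOG_SCALE
      - parameterName: learning-rate
        type: DOUBLE
        minValue: 0.01
        maxValue: 0.1
        scaleType: UNIT_LOG_SCALE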

gcloud ai-platform jobs submit training ${JOB_NAME}_hpt \
  --config hptuning_config.yaml \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir $JOB_DIR \
  --stream-logs

Quickstart for online predictions in AI Platform

This section shows how to use AI Platform and the model you trained in the previous section to predict a person's income bracket from other Census information about them.

Create model and version resources in AI Platform

To serve online predictions using the model you trained and exported in the quickstart for training, create a model resource and a version resource in AI Platform. The version resource is what actually uses your trained model to serve predictions. This structure lets you adjust and retrain your model many times and organize all the versions together in AI Platform. Learn more about models and versions.

First, name and create the model resource:

MODEL_NAME="my_first_keras_model"

gcloud ai-platform models create $MODEL_NAME \
  --regions $REGION
Created ml engine model [projects/your-project-id/models/my_first_keras_model].

Next, create the model version. The training job from the quickstart for training exported a timestamped TensorFlow SavedModel directory to your Cloud Storage bucket. AI Platform uses this directory to create a model version. Learn more about SavedModel and AI Platform.

You can find the path to this directory in your training job's logs. Look for a line like:

Model exported to:  gs://your-bucket-name/keras-job-dir/keras_export/1545439782

Execute the following command to identify your SavedModel directory and use it to create a model version resource:

MODEL_VERSION="v1"

# Get a list of directories in the `keras_export` parent directory. Then pick
# the directory with the latest timestamp, in case you've trained multiple
# times.
SAVED_MODEL_PATH=$(gsutil ls $JOB_DIR/keras_export | tail -n 1)

# Create model version based on that SavedModel directory
gcloud ai-platform versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --runtime-version 1.13 \
  --python-version 3.5 \
  --framework tensorflow \
  --origin $SAVED_MODEL_PATH

Prepare input for prediction

To receive valid and useful predictions, you must preprocess input for prediction in the same way that the training data was preprocessed. In a production system, you may want to create a preprocessing pipeline that can be used identically at training time and at prediction time.

For this exercise, use the training package's data-loading code to select a random sample from the evaluation data. This data is in the form that was used to evaluate accuracy after each epoch of training, so it can be used to send test predictions without further preprocessing.

Open a Python interpreter (python) from your current working directory, and run the next few snippets of code:

from trainer import util

_, _, eval_x, eval_y = util.load_data()

prediction_input = eval_x.sample(20)
prediction_targets = eval_y[prediction_input.index]

prediction_input
age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country
1979 0.901213 1 1.525542 2 9 0 4 -0.144792 -0.217132 -0.437544 38
2430 -0.922154 3 -0.419265 4 2 3 4 -0.144792 -0.217132 -0.034039 38
4214 -1.213893 3 -0.030304 4 10 1 4 -0.144792 -0.217132 1.579979 38
10389 -0.630415 3 0.358658 4 0 3 4 -0.144792 -0.217132 -0.679647 38
14525 -1.505632 3 -1.586149 4 7 3 0 -0.144792 -0.217132 -0.034039 38
15040 -0.119873 5 0.358658 2 2 0 4 -0.144792 -0.217132 -0.841048 38
8409 0.244801 3 1.525542 2 9 0 4 -0.144792 -0.217132 1.176475 6
10628 0.098931 1 1.525542 2 9 0 4 0.886847 -0.217132 -0.034039 38
10942 0.390670 5 -0.030304 2 4 0 4 -0.144792 -0.217132 4.727315 38
5129 1.120017 3 1.136580 2 12 0 4 -0.144792 -0.217132 -0.034039 38
2096 -1.286827 3 -0.030304 4 11 3 4 -0.144792 -0.217132 -1.648058 38
12463 -0.703350 3 -0.419265 2 7 5 4 -0.144792 4.502280 -0.437544 38
8528 0.536539 3 1.525542 4 3 4 4 -0.144792 -0.217132 -0.034039 38
7093 -1.359762 3 -0.419265 4 6 3 2 -0.144792 -0.217132 -0.034039 38
12565 0.536539 3 1.136580 0 11 2 2 -0.144792 -0.217132 -0.034039 38
5655 1.338821 3 -0.419265 2 2 0 4 -0.144792 -0.217132 -0.034039 38
2322 0.682409 3 1.136580 0 12 3 4 -0.144792 -0.217132 -0.034039 38
12652 0.025997 3 1.136580 2 11 0 4 -0.144792 -0.217132 0.369465 38
4755 -0.411611 3 -0.419265 2 11 0 4 -0.144792 -0.217132 1.176475 38
4413 0.390670 6 1.136580 4 4 1 4 -0.144792 -0.217132 -0.034039 38

Notice that categorical fields, like occupation, have already been converted to integers (with the same mapping that was used for training), and numerical fields, like age, have been scaled to z-scores. Some fields have been dropped from the original data. Compare the prediction input with the raw data for the same examples:

import pandas as pd

_, eval_file_path = util.download(util.DATA_DIR)
raw_eval_data = pd.read_csv(eval_file_path,
                            names=util._CSV_COLUMNS,
                            na_values='?')

raw_eval_data.iloc[prediction_input.index]
age workclass fnlwgt education education_num marital_status occupation relationship race gender capital_gain capital_loss hours_per_week native_country income_bracket
1979 51 Local-gov 99064 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 0 0 35 United-States <=50K
2430 26 Private 197967 HS-grad 9 Never-married Craft-repair Own-child White Male 0 0 40 United-States <=50K
4214 22 Private 221694 Some-college 10 Never-married Protective-serv Not-in-family White Male 0 0 60 United-States <=50K
10389 30 Private 96480 Assoc-voc 11 Never-married Adm-clerical Own-child White Female 0 0 32 United-States <=50K
14525 18 Private 146225 10th 6 Never-married Other-service Own-child Amer-Indian-Eskimo Female 0 0 40 United-States <=50K
15040 37 Self-emp-not-inc 50096 Assoc-voc 11 Married-civ-spouse Craft-repair Husband White Male 0 0 30 United-States <=50K
8409 42 Private 102988 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 0 0 55 Ecuador >50K
10628 40 Local-gov 284086 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 7688 0 40 United-States >50K
10942 44 Self-emp-not-inc 52505 Some-college 10 Married-civ-spouse Farming-fishing Husband White Male 0 0 99 United-States <=50K
5129 54 Private 106728 Bachelors 13 Married-civ-spouse Tech-support Husband White Male 0 0 40 United-States <=50K
2096 21 Private 190916 Some-college 10 Never-married Sales Own-child White Female 0 0 20 United-States <=50K
12463 29 Private 197565 HS-grad 9 Married-civ-spouse Other-service Wife White Female 0 1902 35 United-States >50K
8528 46 Private 193188 Masters 14 Never-married Exec-managerial Unmarried White Male 0 0 40 United-States <=50K
7093 20 Private 273147 HS-grad 9 Never-married Machine-op-inspct Own-child Black Male 0 0 40 United-States <=50K
12565 46 Private 203653 Bachelors 13 Divorced Sales Other-relative Black Male 0 0 40 United-States <=50K
5655 57 Private 174662 HS-grad 9 Married-civ-spouse Craft-repair Husband White Male 0 0 40 United-States <=50K
2322 48 Private 232149 Bachelors 13 Divorced Tech-support Own-child White Female 0 0 40 United-States <=50K
12652 39 Private 82521 Bachelors 13 Married-civ-spouse Sales Husband White Male 0 0 45 United-States >50K
4755 33 Private 330715 HS-grad 9 Married-civ-spouse Sales Husband White Male 0 0 55 United-States <=50K
4413 44 State-gov 128586 Bachelors 13 Never-married Farming-fishing Not-in-family White Male 0 0 40 United-States <=50K

Export the prediction input to a newline-delimited JSON file:

import json

with open('prediction_input.json', 'w') as json_file:
  for row in prediction_input.values.tolist():
    json.dump(row, json_file)
    json_file.write('\n')

Exit the Python interpreter (exit()). From your shell, examine prediction_input.json:

cat prediction_input.json
[0.9012127751273994, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.43754385253479555, 38.0]
[-0.9221541171760282, 3.0, -0.4192650914017433, 4.0, 2.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.2138928199445767, 3.0, -0.030303770229214273, 4.0, 10.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.5799792247041626, 38.0]
[-0.6304154144074798, 3.0, 0.35865755094331475, 4.0, 0.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.6796466218034705, 38.0]
[-1.5056315227131252, 3.0, -1.5861490549193304, 4.0, 7.0, 3.0, 0.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-0.11987268456252011, 5.0, 0.35865755094331475, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.8410484679825871, 38.0]
[0.24480069389816542, 3.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 6.0]
[0.0989313425138912, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, 0.8868473744801746, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.39067004528243965, 5.0, -0.030303770229214273, 2.0, 4.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 4.7273152251969375, 38.0]
[1.1200168022038106, 3.0, 1.1365801932883728, 2.0, 12.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.2868274956367138, 3.0, -0.030303770229214273, 4.0, 11.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -1.6480576988781703, 38.0]
[-0.7033500900996169, 3.0, -0.4192650914017433, 2.0, 7.0, 5.0, 4.0, -0.14479173735784842, 4.5022796885373735, -0.43754385253479555, 38.0]
[0.5365393966667138, 3.0, 1.525541514460902, 4.0, 3.0, 4.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.3597621713288508, 3.0, -0.4192650914017433, 4.0, 6.0, 3.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.5365393966667138, 3.0, 1.1365801932883728, 0.0, 11.0, 2.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[1.338820829280222, 3.0, -0.4192650914017433, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.6824087480509881, 3.0, 1.1365801932883728, 0.0, 12.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.0259966668217541, 3.0, 1.1365801932883728, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 0.3694653783607877, 38.0]
[-0.4116113873310685, 3.0, -0.4192650914017433, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 38.0]
[0.39067004528243965, 6.0, 1.1365801932883728, 4.0, 4.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]

The gcloud command-line tool accepts newline-delimited JSON for online prediction, and this particular Keras model expects a flat list of numbers for each input example.

AI Platform requires a different format when you make online prediction requests to the REST API without using the gcloud tool. The way you format data for prediction may also differ depending on how you structure your model. Learn more about formatting data for online prediction.
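
As a sketch of that REST flow using the Google API Client Library for Python (the project, model, and version names below are placeholders, and prediction_input is the DataFrame prepared above):

from googleapiclient import discovery

# Build a client for the AI Platform ("ml") v1 REST API. Credentials come
# from the GOOGLE_APPLICATION_CREDENTIALS environment variable.
service = discovery.build('ml', 'v1')
name = 'projects/your-project-id/models/my_first_keras_model/versions/v1'

# The REST API wraps input examples in an "instances" list.
response = service.projects().predict(
    name=name,
    body={'instances': prediction_input.values.tolist()}
).execute()

print(response['predictions'])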

Submit the online prediction request

Use gcloud to submit your online prediction request:

gcloud ai-platform predict \
  --model $MODEL_NAME \
  --version $MODEL_VERSION \
  --json-instances prediction_input.json
DENSE_4
[0.6854287385940552]
[0.011786997318267822]
[0.037236183881759644]
[0.016223609447479248]
[0.0012015104293823242]
[0.23621389269828796]
[0.6174039244651794]
[0.9822691679000854]
[0.3815768361091614]
[0.6715215444564819]
[0.001094043254852295]
[0.43077391386032104]
[0.22132840752601624]
[0.004075437784194946]
[0.22736871242523193]
[0.4111979305744171]
[0.27328649163246155]
[0.6981356143951416]
[0.3309604525566101]
[0.20807647705078125]

Since the model's last layer uses a sigmoid function for its activation, outputs between 0 and 0.5 represent negative predictions ("<=50K"), and outputs between 0.5 and 1 represent positive ones (">50K").
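
For example, a minimal sketch of converting a few of the raw outputs above into labels by thresholding at 0.5:

# Threshold the sigmoid outputs at 0.5 to obtain income-bracket labels.
outputs = [0.6854287385940552, 0.011786997318267822, 0.9822691679000854]
labels = ['>50K' if p >= 0.5 else '<=50K' for p in outputs]
print(labels)  # ['>50K', '<=50K', '>50K']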

Developing the Keras model from scratch

At this point, you have trained a machine learning model on AI Platform, deployed the trained model as a version resource on AI Platform, and received online predictions from the deployment. The next section walks through recreating the Keras code used to train the model. It covers the following parts of developing a machine learning model for use with AI Platform:

  • Downloading and preprocessing data
  • Designing and training the model
  • Visualizing model training and exporting the trained model

This section provides more detailed insight into the tasks completed in the earlier parts. To learn more about using tf.keras, read TensorFlow's guide to Keras. To learn more about structuring code as a training package for AI Platform, read Packaging a training application, and reference the complete training code, which is structured as a Python package.

Import libraries and define constants

First, import the Python libraries required for training:

import os
from six.moves import urllib
import tempfile

import numpy as np
import pandas as pd
import tensorflow as tf

# Examine software versions
print(__import__('sys').version)
print(tf.__version__)
print(tf.keras.__version__)

Then, define some useful constants:

  • Information for downloading the training and evaluation data
  • Information that Pandas needs to interpret the data and convert categorical fields into numeric features
  • Hyperparameters for training, such as learning rate and batch size

### For downloading data ###

# Storage directory
DATA_DIR = os.path.join(tempfile.gettempdir(), 'census_data')

# Download options.
DATA_URL = 'https://storage.googleapis.com/cloud-samples-data/ai-platform' \
           '/census/data'
TRAINING_FILE = 'adult.data.csv'
EVAL_FILE = 'adult.test.csv'
TRAINING_URL = '%s/%s' % (DATA_URL, TRAINING_FILE)
EVAL_URL = '%s/%s' % (DATA_URL, EVAL_FILE)

### For interpreting data ###

# These are the features in the dataset.
# Dataset information: https://archive.ics.uci.edu/ml/datasets/census+income
_CSV_COLUMNS = [
    'age', 'workclass', 'fnlwgt', 'education', 'education_num',
    'marital_status', 'occupation', 'relationship', 'race', 'gender',
    'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
    'income_bracket'
]

_CATEGORICAL_TYPES = {
  'workclass': pd.api.types.CategoricalDtype(categories=[
    'Federal-gov', 'Local-gov', 'Never-worked', 'Private', 'Self-emp-inc',
    'Self-emp-not-inc', 'State-gov', 'Without-pay'
  ]),
  'marital_status': pd.api.types.CategoricalDtype(categories=[
    'Divorced', 'Married-AF-spouse', 'Married-civ-spouse',
    'Married-spouse-absent', 'Never-married', 'Separated', 'Widowed'
  ]),
  'occupation': pd.api.types.CategoricalDtype([
    'Adm-clerical', 'Armed-Forces', 'Craft-repair', 'Exec-managerial',
    'Farming-fishing', 'Handlers-cleaners', 'Machine-op-inspct',
    'Other-service', 'Priv-house-serv', 'Prof-specialty', 'Protective-serv',
    'Sales', 'Tech-support', 'Transport-moving'
  ]),
  'relationship': pd.api.types.CategoricalDtype(categories=[
    'Husband', 'Not-in-family', 'Other-relative', 'Own-child', 'Unmarried',
    'Wife'
  ]),
  'race': pd.api.types.CategoricalDtype(categories=[
    'Amer-Indian-Eskimo', 'Asian-Pac-Islander', 'Black', 'Other', 'White'
  ]),
  'native_country': pd.api.types.CategoricalDtype(categories=[
    'Cambodia', 'Canada', 'China', 'Columbia', 'Cuba', 'Dominican-Republic',
    'Ecuador', 'El-Salvador', 'England', 'France', 'Germany', 'Greece',
    'Guatemala', 'Haiti', 'Holand-Netherlands', 'Honduras', 'Hong', 'Hungary',
    'India', 'Iran', 'Ireland', 'Italy', 'Jamaica', 'Japan', 'Laos', 'Mexico',
    'Nicaragua', 'Outlying-US(Guam-USVI-etc)', 'Peru', 'Philippines', 'Poland',
    'Portugal', 'Puerto-Rico', 'Scotland', 'South', 'Taiwan', 'Thailand',
    'Trinadad&Tobago', 'United-States', 'Vietnam', 'Yugoslavia'
  ]),
  'income_bracket': pd.api.types.CategoricalDtype(categories=[
    '<=50K', '>50K'
  ])
}

# This is the label (target) we want to predict.
_LABEL_COLUMN = 'income_bracket'

### Hyperparameters for training ###

# This is the training batch size
BATCH_SIZE = 128

# This is the number of epochs (passes over the full training data)
NUM_EPOCHS = 20

# Define learning rate.
LEARNING_RATE = .01

Download and preprocess data

Download the data

Next, define functions to download the training and evaluation data. These functions also fix minor irregularities in the data's formatting.

def _download_and_clean_file(filename, url):
  """Downloads data from url, and makes changes to match the CSV format.

  The CSVs may use spaces after the comma delimiters (non-standard) or include
  rows which do not represent well-formed examples. This function strips out
  some of these problems.

  Args:
    filename: filename to save url to
    url: URL of resource to download
  """
  temp_file, _ = urllib.request.urlretrieve(url)
  with tf.gfile.Open(temp_file, 'r') as temp_file_object:
    with tf.gfile.Open(filename, 'w') as file_object:
      for line in temp_file_object:
        line = line.strip()
        line = line.replace(', ', ',')
        if not line or ',' not in line:
          continue
        if line[-1] == '.':
          line = line[:-1]
        line += '\n'
        file_object.write(line)
  tf.gfile.Remove(temp_file)

def download(data_dir):
  """Downloads census data if it is not already present.

  Args:
    data_dir: directory where we will access/save the census data
  """
  tf.gfile.MakeDirs(data_dir)

  training_file_path = os.path.join(data_dir, TRAINING_FILE)
  if not tf.gfile.Exists(training_file_path):
    _download_and_clean_file(training_file_path, TRAINING_URL)

  eval_file_path = os.path.join(data_dir, EVAL_FILE)
  if not tf.gfile.Exists(eval_file_path):
    _download_and_clean_file(eval_file_path, EVAL_URL)

  return training_file_path, eval_file_path

Use those functions to download the data and verify that you have the CSV files for training and evaluation:

training_file_path, eval_file_path = download(DATA_DIR)

Next, load these files using Pandas and examine the data:

# This census data uses the value '?' for fields (columns) that are missing data.
# We use na_values to find '?' and convert those entries to NaN.
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

train_df = pd.read_csv(training_file_path, names=_CSV_COLUMNS, na_values='?')
eval_df = pd.read_csv(eval_file_path, names=_CSV_COLUMNS, na_values='?')

The following table shows an excerpt of the data (train_df.head()) before preprocessing:

age workclass fnlwgt education education_num marital_status occupation relationship race gender capital_gain capital_loss hours_per_week native_country income_bracket
0 39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K
1 50 Self-emp-not-inc 83311 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 0 0 13 United-States <=50K
2 38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K
3 53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K
4 28 Private 338409 Bachelors 13 Married-civ-spouse Prof-specialty Wife Black Female 0 0 40 Cuba <=50K

Preprocess the data

The first preprocessing step removes certain features from the data and converts categorical features to numerical values for use with Keras.

Learn more about feature engineering and bias in data.

UNUSED_COLUMNS = ['fnlwgt', 'education', 'gender']

def preprocess(dataframe):
  """Converts categorical features to numeric. Removes unused columns.

  Args:
    dataframe: Pandas dataframe with raw data

  Returns:
    Dataframe with preprocessed data
  """
  dataframe = dataframe.drop(columns=UNUSED_COLUMNS)

  # Convert integer valued (numeric) columns to floating point
  numeric_columns = dataframe.select_dtypes(['int64']).columns
  dataframe[numeric_columns] = dataframe[numeric_columns].astype('float32')

  # Convert categorical columns to numeric
  cat_columns = dataframe.select_dtypes(['object']).columns
  dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.astype(
    _CATEGORICAL_TYPES[x.name]))
  dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.cat.codes)
  return dataframe

prepped_train_df = preprocess(train_df)
prepped_eval_df = preprocess(eval_df)

The next table (prepped_train_df.head()) shows how preprocessing changed the data. Notice in particular that income_bracket, the label that you're training the model to predict, has changed from <=50K and >50K to 0 and 1:

age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country income_bracket
0 39.0 6 13.0 4 0 1 4 2174.0 0.0 40.0 38 0
1 50.0 5 13.0 2 3 0 4 0.0 0.0 13.0 38 0
2 38.0 3 9.0 0 5 1 4 0.0 0.0 40.0 38 0
3 53.0 3 7.0 2 5 0 2 0.0 0.0 40.0 38 0
4 28.0 3 13.0 2 9 5 2 0.0 0.0 40.0 4 0

Next, separate the data into features ("x") and labels ("y"), and reshape the label arrays into a format suitable for later use with tf.data.Dataset:

# Split train and test data with labels.
# The pop() method will extract (copy) and remove the label column from the dataframe
train_x, train_y = prepped_train_df, prepped_train_df.pop(_LABEL_COLUMN)
eval_x, eval_y = prepped_eval_df, prepped_eval_df.pop(_LABEL_COLUMN)

# Reshape label columns for use with tf.data.Dataset
train_y = np.asarray(train_y).astype('float32').reshape((-1, 1))
eval_y = np.asarray(eval_y).astype('float32').reshape((-1, 1))

Scaling the training data so that each numerical feature column has a mean of 0 and a standard deviation of 1 can improve your model.

In a production system, you may want to save the means and standard deviations from your training set and use them to perform an identical transformation on test data at prediction time. For convenience in this exercise, temporarily combine the training and evaluation data to scale both of them:

def standardize(dataframe):
  """Scales numerical columns using their means and standard deviation to get
  z-scores: the mean of each numerical column becomes 0, and the standard
  deviation becomes 1. This can help the model converge during training.

  Args:
    dataframe: Pandas dataframe

  Returns:
    Input dataframe with the numerical columns scaled to z-scores
  """
  dtypes = list(zip(dataframe.dtypes.index, map(str, dataframe.dtypes)))
  # Normalize numeric columns.
  for column, dtype in dtypes:
      if dtype == 'float32':
          dataframe[column] -= dataframe[column].mean()
          dataframe[column] /= dataframe[column].std()
  return dataframe

# Join train_x and eval_x to normalize on overall means and standard
# deviations. Then separate them again.
all_x = pd.concat([train_x, eval_x], keys=['train', 'eval'])
all_x = standardize(all_x)
train_x, eval_x = all_x.xs('train'), all_x.xs('eval')
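
As a sketch of the production approach mentioned above, you would instead compute the statistics on the unscaled training features only and reuse them for any later data (assuming DataFrames shaped like this tutorial's train_x and eval_x before scaling):

# Compute scaling statistics on the (unscaled) training set only.
numeric_columns = train_x.select_dtypes(['float32']).columns
train_means = train_x[numeric_columns].mean()
train_stds = train_x[numeric_columns].std()

def apply_standardization(dataframe):
    """Applies the training set's scaling to any later data."""
    dataframe = dataframe.copy()
    dataframe[numeric_columns] = (
        (dataframe[numeric_columns] - train_means) / train_stds)
    return dataframe

# At prediction time, the same statistics transform new input identically.
eval_x_scaled = apply_standardization(eval_x)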

This table (train_x.head()) shows what the fully preprocessed data looks like:

age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country
0 0.025997 6 1.136580 4 0 1 4 0.146933 -0.217132 -0.034039 38
1 0.828278 5 1.136580 2 3 0 4 -0.144792 -0.217132 -2.212964 38
2 -0.046938 3 -0.419265 0 5 1 4 -0.144792 -0.217132 -0.034039 38
3 1.047082 3 -1.197188 2 5 0 2 -0.144792 -0.217132 -0.034039 38
4 -0.776285 3 1.136580 2 9 5 2 -0.144792 -0.217132 -0.034039 4

Design and train the model

Create training and validation datasets

Create an input function to convert the features and labels into a tf.data.Dataset for training or evaluation:

def input_fn(features, labels, shuffle, num_epochs, batch_size):
  """Generates an input function to be used for model training.

  Args:
    features: numpy array of features used for training or inference
    labels: numpy array of labels for each example
    shuffle: boolean for whether to shuffle the data or not (set True for
      training, False for evaluation)
    num_epochs: number of epochs to provide the data for
    batch_size: batch size for training

  Returns:
    A tf.data.Dataset that can provide data to the Keras model for training or
      evaluation
  """
  if labels is None:
    inputs = features
  else:
    inputs = (features, labels)
  dataset = tf.data.Dataset.from_tensor_slices(inputs)

  if shuffle:
    dataset = dataset.shuffle(buffer_size=len(features))

  # We call repeat after shuffling, rather than before, to prevent separate
  # epochs from blending together.
  dataset = dataset.repeat(num_epochs)
  dataset = dataset.batch(batch_size)
  return dataset
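
Note the labels is None branch: the same function can build a features-only dataset, which is useful at prediction time. For example (a usage sketch, not part of this tutorial's flow):

# Build a features-only dataset, e.g. for keras_model.predict().
# One pass over the data (num_epochs=1), no shuffling.
prediction_dataset = input_fn(features=eval_x.values,
                              labels=None,
                              shuffle=False,
                              num_epochs=1,
                              batch_size=BATCH_SIZE)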

Next, create these training and evaluation datasets. Use the NUM_EPOCHS and BATCH_SIZE hyperparameters defined previously to define how the training dataset provides examples to the model during training. Set up the validation dataset to provide all of its examples in one batch, for a single validation step at the end of each training epoch.

# Pass a numpy array by using DataFrame.values
training_dataset = input_fn(features=train_x.values,
                    labels=train_y,
                    shuffle=True,
                    num_epochs=NUM_EPOCHS,
                    batch_size=BATCH_SIZE)

num_eval_examples = eval_x.shape[0]

# Pass a numpy array by using DataFrame.values
validation_dataset = input_fn(features=eval_x.values,
                    labels=eval_y,
                    shuffle=False,
                    num_epochs=NUM_EPOCHS,
                    batch_size=num_eval_examples)

Design the Keras model

Design your neural network using the Keras Sequential API.

This deep neural network (DNN) has several hidden layers, and the last layer uses a sigmoid activation function to output a value between 0 and 1:

  • The input layer has 100 units using the ReLU activation function.
  • The hidden layer has 75 units using the ReLU activation function.
  • The hidden layer has 50 units using the ReLU activation function.
  • The hidden layer has 25 units using the ReLU activation function.
  • The output layer has 1 unit using a sigmoid activation function.
  • The model is compiled with the binary cross-entropy loss function, which is appropriate for the binary classification problem described here.

Feel free to change these layers to try to improve the model:

def create_keras_model(input_dim, learning_rate):
  """Creates Keras Model for Binary Classification.

  Args:
    input_dim: How many features the input has
    learning_rate: Learning rate for training

  Returns:
    The compiled Keras model (still needs to be trained)
  """
  Dense = tf.keras.layers.Dense
  model = tf.keras.Sequential(
    [
        Dense(100, activation=tf.nn.relu, kernel_initializer='uniform',
                input_shape=(input_dim,)),
        Dense(75, activation=tf.nn.relu),
        Dense(50, activation=tf.nn.relu),
        Dense(25, activation=tf.nn.relu),
        Dense(1, activation=tf.nn.sigmoid)
    ])

  # Custom Optimizer:
  # https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
  optimizer = tf.keras.optimizers.RMSprop(
      lr=learning_rate)

  # Compile Keras model
  model.compile(
      loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
  return model

Next, create the Keras model object:

num_train_examples, input_dim = train_x.shape
print('Number of features: {}'.format(input_dim))
print('Number of examples: {}'.format(num_train_examples))

keras_model = create_keras_model(
    input_dim=input_dim,
    learning_rate=LEARNING_RATE)

Examining the model with keras_model.summary() should return output like the following:

Number of features: 11
Number of examples: 32561
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 100)               1200
_________________________________________________________________
dense_1 (Dense)              (None, 75)                7575
_________________________________________________________________
dense_2 (Dense)              (None, 50)                3800
_________________________________________________________________
dense_3 (Dense)              (None, 25)                1275
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 26
=================================================================
Total params: 13,876
Trainable params: 13,876
Non-trainable params: 0
_________________________________________________________________

Train and evaluate the model

Define a learning rate decay so that parameter updates become smaller as training progresses:

# Setup Learning Rate decay.
lr_decay_cb = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: LEARNING_RATE + 0.02 * (0.5 ** (1 + epoch)),
    verbose=True)

# Setup TensorBoard callback.
JOB_DIR = os.getenv('JOB_DIR')
tensorboard_cb = tf.keras.callbacks.TensorBoard(
      os.path.join(JOB_DIR, 'keras_tensorboard'),
      histogram_freq=1)
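
To see what this schedule produces, you can evaluate the lambda directly. With LEARNING_RATE = 0.01, the learning rate starts at 0.02 and decays toward 0.01, matching the values printed in the training logs shown further below:

# Evaluate the decay schedule for the first few epochs.
schedule = lambda epoch: LEARNING_RATE + 0.02 * (0.5 ** (1 + epoch))
for epoch in range(3):
    print(epoch, schedule(epoch))
# Prints approximately: 0 0.02, 1 0.015, 2 0.0125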

Finally, train the model. Provide the appropriate steps_per_epoch so that the model trains on the entire training dataset (with BATCH_SIZE examples per step) during each epoch. And instruct the model to calculate validation accuracy with one big validation batch at the end of each epoch.

history = keras_model.fit(training_dataset,
                          epochs=NUM_EPOCHS,
                          steps_per_epoch=int(num_train_examples/BATCH_SIZE),
                          validation_data=validation_dataset,
                          validation_steps=1,
                          callbacks=[lr_decay_cb, tensorboard_cb],
                          verbose=1)

The training progress may look like the following:

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.

Epoch 00001: LearningRateScheduler reducing learning rate to 0.02.
Epoch 1/20
254/254 [==============================] - 1s 5ms/step - loss: 0.6986 - acc: 0.7893 - val_loss: 0.3894 - val_acc: 0.8329

Epoch 00002: LearningRateScheduler reducing learning rate to 0.015.
Epoch 2/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3574 - acc: 0.8335 - val_loss: 0.3861 - val_acc: 0.8131

...

Epoch 00019: LearningRateScheduler reducing learning rate to 0.010000038146972657.
Epoch 19/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3239 - acc: 0.8512 - val_loss: 0.3334 - val_acc: 0.8496

Epoch 00020: LearningRateScheduler reducing learning rate to 0.010000019073486329.
Epoch 20/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3279 - acc: 0.8504 - val_loss: 0.3174 - val_acc: 0.8523

Visualize training and export the trained model

Visualize the training

Import matplotlib to visualize how the model learned over the course of training. If needed, install it first with pip install matplotlib.

from matplotlib import pyplot as plt

Plot the model's loss (binary cross-entropy) and accuracy, as measured at the end of each training epoch:

# Visualize History for Loss.
plt.title('Keras model loss')
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['training', 'validation'], loc='upper right')
plt.show()

# Visualize History for Accuracy.
plt.title('Keras model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc='lower right')
plt.show()

Over time, loss decreases and accuracy increases. But do they converge to a stable level? Are there big differences between the training and validation metrics (a sign of overfitting)?
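
One rough way to quantify that gap is to compare the final training and validation metrics recorded in history (the metric keys are 'acc' and 'val_acc' in this TensorFlow version, as in the plotting code above):

# Compare final-epoch training and validation accuracy; a large gap
# can be a sign of overfitting.
final_train_acc = history.history['acc'][-1]
final_val_acc = history.history['val_acc'][-1]
print('train: {:.4f}  validation: {:.4f}  gap: {:.4f}'.format(
    final_train_acc, final_val_acc, final_train_acc - final_val_acc))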

Learn how to improve your machine learning model. Then, feel free to adjust hyperparameters or the model architecture and train again.

Export the model for serving

Use tf.contrib.saved_model.save_keras_model to export a TensorFlow SavedModel directory. This is the same format that AI Platform requires when you create a model version resource.

Since not all optimizers can be exported to the SavedModel format, you may see warnings during the export process. As long as you successfully export a serving graph, AI Platform can use the SavedModel to serve predictions.

# Export the model to a local SavedModel directory
export_path = tf.contrib.saved_model.save_keras_model(keras_model, 'keras_export')
print("Model exported to: ", export_path)
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.

Consider using a TensorFlow optimizer from `tf.train`.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py:1436: update_checkpoint_state (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.train.CheckpointManager to manage checkpoints rather than manually editing the Checkpoint proto.
WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:205: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: keras_export/1553710367/saved_model.pb
Model exported to:  b'keras_export/1553710367'

You may export a SavedModel directory to your local filesystem or to Cloud Storage, as long as you have the necessary permissions. In your current environment, you granted access to Cloud Storage by authenticating your GCP account and setting the GOOGLE_APPLICATION_CREDENTIALS environment variable. AI Platform training jobs can also export directly to Cloud Storage, because the AI Platform service account has access to Cloud Storage buckets in its own project.

Try exporting directly to Cloud Storage:

JOB_DIR = os.getenv('JOB_DIR')

# Export the model to a SavedModel directory in Cloud Storage
export_path = tf.contrib.saved_model.save_keras_model(keras_model, JOB_DIR + '/keras_export')
print("Model exported to: ", export_path)
WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.

Consider using a TensorFlow optimizer from `tf.train`.
WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: gs://your-bucket-name/keras-job-dir/keras_export/1553710379/saved_model.pb
Model exported to:  b'gs://your-bucket-name/keras-job-dir/keras_export/1553710379'

You can now deploy this model to AI Platform and serve predictions from it by following the steps from the quickstart for prediction.
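
Before deploying, you can optionally inspect the exported serving signature with the saved_model_cli tool that ships with TensorFlow (a sketch; substitute the timestamped directory your export step actually produced):

# Show the inputs and outputs of the serving signature.
saved_model_cli show --dir keras_export/1553710367 \
  --tag_set serve --signature_def serving_default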

Cleaning up

To clean up all the GCP resources used in this project, you can delete the GCP project you used for the tutorial.

Alternatively, you can clean up individual resources by running the following commands:

# Delete model version resource
gcloud ai-platform versions delete $MODEL_VERSION --quiet --model $MODEL_NAME

# Delete model resource
gcloud ai-platform models delete $MODEL_NAME --quiet

# Delete Cloud Storage objects that were created
gsutil -m rm -r $JOB_DIR

# If training job is still running, cancel it
gcloud ai-platform jobs cancel $JOB_NAME --quiet

If your Cloud Storage bucket doesn't contain any other objects and you would like to delete it, run gsutil rm -r gs://$BUCKET_NAME.

What's next

  • View the complete training code used in this guide. It is structured to accept custom hyperparameters as command-line flags.
  • Read about packaging code for an AI Platform training job.
  • Read about deploying a model to serve predictions.