Getting started: Training and prediction with Keras

This tutorial shows how to train a neural network on Cloud Machine Learning Engine using the Keras sequential API, and how to serve predictions from that model.

Keras is a high-level API for building and training deep learning models, and tf.keras is TensorFlow's implementation of this API.

The first two parts of the tutorial walk through training a model on Cloud ML Engine using prewritten Keras code and serving online predictions from the deployed model.

The last part of the tutorial digs into the training code used for this model and ensures that it is compatible with Cloud ML Engine. To learn more about building machine learning models in Keras more generally, read TensorFlow's Keras tutorials.

Dataset

This tutorial uses the United States Census Income Dataset provided by the UC Irvine Machine Learning Repository. This dataset contains information about people from a 1994 Census database, including age, education level, marital status, occupation, and whether they make more than $50,000 a year.

Objectives

The goal is to use Keras to train a deep neural network (DNN) that predicts whether a person makes more than $50,000 a year (the target label) based on other Census information about the person (the features).

This tutorial focuses more on using this model with Cloud ML Engine than on the design of the model itself. However, it's always important to think about potential problems and unintended consequences when building machine learning systems. See the Machine Learning Crash Course exercise about fairness to learn about sources of bias in the Census dataset, as well as machine learning fairness more generally.

Costs

This tutorial uses billable components of Google Cloud Platform (GCP), including:

  • Cloud ML Engine
  • Cloud Storage

Learn about Cloud ML Engine pricing and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.

Before you begin

You must do several things before you can train and deploy a model in Cloud ML Engine:

  • Set up your local development environment.
  • Set up a GCP project with billing and the necessary APIs enabled.
  • Create a Cloud Storage bucket to store your training package and your trained model.

Set up your local development environment

You need the following to complete this tutorial:

  • Git
  • Python 3
  • virtualenv
  • Cloud SDK

Follow Google Cloud's guide to setting up a Python development environment for detailed instructions on meeting these requirements. The following steps provide a condensed set of instructions:

  1. Install Python 3.

  2. Install virtualenv and create a virtual environment that uses Python 3.

  3. Activate the virtual environment you created in the previous step.

  4. Complete the steps in the following section to install the Cloud SDK.

Set up your GCP project

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a Google Cloud Platform project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your Google Cloud Platform project.

    Learn how to enable billing

  4. Enable the Cloud Machine Learning Engine and Compute Engine APIs.

    Enable the APIs

  5. Install and initialize the Cloud SDK.

Authenticate your GCP account

To set up authentication, you need to create a service account key and set an environment variable to the file path of the service account key.

  1. Create a service account key for authentication:
    1. In the GCP Console, go to the Create service account key page.

      Go to the Create service account key page
    2. From the Service account drop-down list, select New service account.
    3. In the Service account name field, enter a name.
    4. From the Role drop-down list, select Machine Learning Engine > ML Engine Admin, and then Storage > Storage Object Admin.

      Note: The Role field authorizes your service account to access resources. You can view and change this field later by using the GCP Console. If you are developing a production app, specify more granular permissions than Machine Learning Engine > ML Engine Admin and Storage > Storage Object Admin. For more information, see access control for Cloud ML Engine.
    5. Click Create. A JSON file that contains your key downloads to your computer.
  2. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the file path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.

Create a Cloud Storage bucket

When you submit a training job using the Cloud SDK, you upload a Python package containing your training code to a Cloud Storage bucket. Cloud ML Engine runs the code from this package. In this tutorial, Cloud ML Engine also saves the trained model that results from your job in the same bucket. You can then create a Cloud ML Engine model version based on this output in order to serve online predictions.

Set the name of your Cloud Storage bucket as an environment variable. It must be unique across all Cloud Storage buckets:

BUCKET_NAME="your-bucket-name"

Select a region where Cloud ML Engine is available for training and prediction, and create another environment variable. You may not use a Multi-Regional Storage bucket for training with Cloud ML Engine. For example:

REGION="us-central1"

Create the Cloud Storage bucket in this region and, later, use the same region for training and prediction. Run the following command to create the bucket if it doesn't already exist:

gsutil mb -l $REGION gs://$BUCKET_NAME

Quickstart for training in Cloud ML Engine

This section of the tutorial walks you through submitting a training job to Cloud ML Engine. This job runs sample code that uses Keras to train a deep neural network on the United States Census data. It outputs the trained model as a TensorFlow SavedModel directory in your Cloud Storage bucket.

Get training code and dependencies

First, download the training code and change the working directory:

# Clone the repository of Cloud ML Engine samples
git clone --depth 1 https://github.com/GoogleCloudPlatform/cloudml-samples

# Set the working directory to the sample code directory
cd cloudml-samples/census/tf-keras

Notice that the training code is structured as a Python package in the trainer/ subdirectory:

# `ls` shows the working directory's contents. The `p` flag adds trailing
# slashes to subdirectory names. The `R` flag lists subdirectories recursively.
! ls -pR
.:
README.md  requirements.txt  trainer/

./trainer:
__init__.py  model.py  task.py  util.py

Next, install the Python dependencies needed to train the model locally:

pip install -r requirements.txt

When you run the training job in Cloud ML Engine, dependencies are preinstalled based on the runtime version you choose.

Train your model locally

Before training on Cloud ML Engine, train the job locally to verify that the file structure and packaging are correct.

For a complex or resource-intensive job, you may want to train locally on a small sample of your dataset to verify your code. Then you can run the job on Cloud ML Engine to train on the whole dataset.

This sample runs a relatively quick job on a small dataset, so the local training and the Cloud ML Engine job run the same code on the same data.

Run the following command to train a model locally:

# This is similar to `python -m trainer.task --job-dir local-training-output`
# but it better replicates the Cloud ML Engine environment, especially
# for distributed training (not applicable here).
gcloud ml-engine local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir local-training-output

Watch the progress of your training in your shell. At the end, the training application exports the trained model and prints a message like the following:

Model exported to:  local-training-output/keras_export/1553709223

Train your model using Cloud ML Engine

Next, submit a training job to Cloud ML Engine. This runs the training module in the cloud and exports the trained model to Cloud Storage.

First, give your training job a name and choose a directory within your Cloud Storage bucket for saving intermediate and output files, then set these as environment variables. For example:

JOB_NAME="my_first_keras_job"
JOB_DIR="gs://$BUCKET_NAME/keras-job-dir"

Run the following command to package the trainer/ directory, upload it to the specified --job-dir, and instruct Cloud ML Engine to run the trainer.task module from that package.

The --stream-logs flag lets you view training logs in your shell. You can also see logs and other job details in the GCP Console.

gcloud ml-engine jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir $JOB_DIR \
  --stream-logs

This training job may take longer than local training, but you can watch the progress in your shell in a similar fashion. At the end, the training job exports the trained model to your Cloud Storage bucket and prints a message like the following:

INFO    2019-03-27 17:57:11 +0000   master-replica-0        Model exported to:  gs://your-bucket-name/keras-job-dir/keras_export/1553709421
INFO    2019-03-27 17:57:11 +0000   master-replica-0        Module completed; cleaning up.
INFO    2019-03-27 17:57:11 +0000   master-replica-0        Clean up finished.
INFO    2019-03-27 17:57:11 +0000   master-replica-0        Task completed successfully.

Quickstart for online predictions in Cloud ML Engine

This section shows how to use Cloud ML Engine and the model you trained in the previous section to predict a person's income bracket from other Census information about them.

Create model and version resources in Cloud ML Engine

To serve online predictions using the model you trained and exported in the quickstart for training, create a model resource and a version resource in Cloud ML Engine. The version resource is what actually uses your trained model to serve predictions. This structure lets you adjust and retrain your model many times and organize all the versions together in Cloud ML Engine. Learn more about models and versions.

First, name and create the model resource:

MODEL_NAME="my_first_keras_model"

gcloud ml-engine models create $MODEL_NAME \
  --regions $REGION
Created ml engine model [projects/your-project-id/models/my_first_keras_model].

Next, create the model version. The training job from the quickstart for training exported a timestamped TensorFlow SavedModel directory to your Cloud Storage bucket. Cloud ML Engine uses this directory to create a model version. Learn more about SavedModel and Cloud ML Engine.

You can find the path to this directory in your training job's logs. Look for a line like the following:

Model exported to:  gs://your-bucket-name/keras-job-dir/keras_export/1545439782

Run the following command to identify your SavedModel directory and use it to create a model version resource:

MODEL_VERSION="v1"

# Get a list of directories in the `keras_export` parent directory. Then pick
# the directory with the latest timestamp, in case you've trained multiple
# times.
SAVED_MODEL_PATH=$(gsutil ls $JOB_DIR/keras_export | tail -n 1)

# Create model version based on that SavedModel directory
gcloud ml-engine versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --runtime-version 1.13 \
  --python-version 3.5 \
  --framework tensorflow \
  --origin $SAVED_MODEL_PATH

Prepare input for prediction

To receive valid and useful predictions, you must preprocess input for prediction in the same way that the training data was preprocessed. In a production system, you may want to create a preprocessing pipeline that can be used identically at training time and prediction time.

For this exercise, use the training package's data-loading code to select a random sample from the evaluation data. This data is in the form that was used to evaluate accuracy after each epoch of training, so it can be used to send test predictions without further preprocessing.

Open the Python interpreter (python) from your current working directory, and run the next few snippets of code:

from trainer import util

_, _, eval_x, eval_y = util.load_data()

prediction_input = eval_x.sample(20)
prediction_targets = eval_y[prediction_input.index]

prediction_input
age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country
1979 0.901213 1 1.525542 2 9 0 4 -0.144792 -0.217132 -0.437544 38
2430 -0.922154 3 -0.419265 4 2 3 4 -0.144792 -0.217132 -0.034039 38
4214 -1.213893 3 -0.030304 4 10 1 4 -0.144792 -0.217132 1.579979 38
10389 -0.630415 3 0.358658 4 0 3 4 -0.144792 -0.217132 -0.679647 38
14525 -1.505632 3 -1.586149 4 7 3 0 -0.144792 -0.217132 -0.034039 38
15040 -0.119873 5 0.358658 2 2 0 4 -0.144792 -0.217132 -0.841048 38
8409 0.244801 3 1.525542 2 9 0 4 -0.144792 -0.217132 1.176475 6
10628 0.098931 1 1.525542 2 9 0 4 0.886847 -0.217132 -0.034039 38
10942 0.390670 5 -0.030304 2 4 0 4 -0.144792 -0.217132 4.727315 38
5129 1.120017 3 1.136580 2 12 0 4 -0.144792 -0.217132 -0.034039 38
2096 -1.286827 3 -0.030304 4 11 3 4 -0.144792 -0.217132 -1.648058 38
12463 -0.703350 3 -0.419265 2 7 5 4 -0.144792 4.502280 -0.437544 38
8528 0.536539 3 1.525542 4 3 4 4 -0.144792 -0.217132 -0.034039 38
7093 -1.359762 3 -0.419265 4 6 3 2 -0.144792 -0.217132 -0.034039 38
12565 0.536539 3 1.136580 0 11 2 2 -0.144792 -0.217132 -0.034039 38
5655 1.338821 3 -0.419265 2 2 0 4 -0.144792 -0.217132 -0.034039 38
2322 0.682409 3 1.136580 0 12 3 4 -0.144792 -0.217132 -0.034039 38
12652 0.025997 3 1.136580 2 11 0 4 -0.144792 -0.217132 0.369465 38
4755 -0.411611 3 -0.419265 2 11 0 4 -0.144792 -0.217132 1.176475 38
4413 0.390670 6 1.136580 4 4 1 4 -0.144792 -0.217132 -0.034039 38

Notice that categorical fields, like occupation, have already been converted to integers (with the same mapping that was used for training). Numerical fields, like age, have been scaled to a z-score. Some fields have been dropped from the original data. Compare the prediction input with the raw data for the same examples:

import pandas as pd

_, eval_file_path = util.download(util.DATA_DIR)
raw_eval_data = pd.read_csv(eval_file_path,
                            names=util._CSV_COLUMNS,
                            na_values='?')

raw_eval_data.iloc[prediction_input.index]
age workclass fnlwgt education education_num marital_status occupation relationship race gender capital_gain capital_loss hours_per_week native_country income_bracket
1979 51 Local-gov 99064 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 0 0 35 United-States <=50K
2430 26 Private 197967 HS-grad 9 Never-married Craft-repair Own-child White Male 0 0 40 United-States <=50K
4214 22 Private 221694 Some-college 10 Never-married Protective-serv Not-in-family White Male 0 0 60 United-States <=50K
10389 30 Private 96480 Assoc-voc 11 Never-married Adm-clerical Own-child White Female 0 0 32 United-States <=50K
14525 18 Private 146225 10th 6 Never-married Other-service Own-child Amer-Indian-Eskimo Female 0 0 40 United-States <=50K
15040 37 Self-emp-not-inc 50096 Assoc-voc 11 Married-civ-spouse Craft-repair Husband White Male 0 0 30 United-States <=50K
8409 42 Private 102988 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 0 0 55 Ecuador >50K
10628 40 Local-gov 284086 Masters 14 Married-civ-spouse Prof-specialty Husband White Male 7688 0 40 United-States >50K
10942 44 Self-emp-not-inc 52505 Some-college 10 Married-civ-spouse Farming-fishing Husband White Male 0 0 99 United-States <=50K
5129 54 Private 106728 Bachelors 13 Married-civ-spouse Tech-support Husband White Male 0 0 40 United-States <=50K
2096 21 Private 190916 Some-college 10 Never-married Sales Own-child White Female 0 0 20 United-States <=50K
12463 29 Private 197565 HS-grad 9 Married-civ-spouse Other-service Wife White Female 0 1902 35 United-States >50K
8528 46 Private 193188 Masters 14 Never-married Exec-managerial Unmarried White Male 0 0 40 United-States <=50K
7093 20 Private 273147 HS-grad 9 Never-married Machine-op-inspct Own-child Black Male 0 0 40 United-States <=50K
12565 46 Private 203653 Bachelors 13 Divorced Sales Other-relative Black Male 0 0 40 United-States <=50K
5655 57 Private 174662 HS-grad 9 Married-civ-spouse Craft-repair Husband White Male 0 0 40 United-States <=50K
2322 48 Private 232149 Bachelors 13 Divorced Tech-support Own-child White Female 0 0 40 United-States <=50K
12652 39 Private 82521 Bachelors 13 Married-civ-spouse Sales Husband White Male 0 0 45 United-States >50K
4755 33 Private 330715 HS-grad 9 Married-civ-spouse Sales Husband White Male 0 0 55 United-States <=50K
4413 44 State-gov 128586 Bachelors 13 Never-married Farming-fishing Not-in-family White Male 0 0 40 United-States <=50K

Export the prediction input to a newline-delimited JSON file:

import json

with open('prediction_input.json', 'w') as json_file:
  for row in prediction_input.values.tolist():
    json.dump(row, json_file)
    json_file.write('\n')

Exit the Python interpreter (exit()). From your shell, examine prediction_input.json:

cat prediction_input.json
[0.9012127751273994, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.43754385253479555, 38.0]
[-0.9221541171760282, 3.0, -0.4192650914017433, 4.0, 2.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.2138928199445767, 3.0, -0.030303770229214273, 4.0, 10.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.5799792247041626, 38.0]
[-0.6304154144074798, 3.0, 0.35865755094331475, 4.0, 0.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.6796466218034705, 38.0]
[-1.5056315227131252, 3.0, -1.5861490549193304, 4.0, 7.0, 3.0, 0.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-0.11987268456252011, 5.0, 0.35865755094331475, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.8410484679825871, 38.0]
[0.24480069389816542, 3.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 6.0]
[0.0989313425138912, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, 0.8868473744801746, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.39067004528243965, 5.0, -0.030303770229214273, 2.0, 4.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 4.7273152251969375, 38.0]
[1.1200168022038106, 3.0, 1.1365801932883728, 2.0, 12.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.2868274956367138, 3.0, -0.030303770229214273, 4.0, 11.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -1.6480576988781703, 38.0]
[-0.7033500900996169, 3.0, -0.4192650914017433, 2.0, 7.0, 5.0, 4.0, -0.14479173735784842, 4.5022796885373735, -0.43754385253479555, 38.0]
[0.5365393966667138, 3.0, 1.525541514460902, 4.0, 3.0, 4.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[-1.3597621713288508, 3.0, -0.4192650914017433, 4.0, 6.0, 3.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.5365393966667138, 3.0, 1.1365801932883728, 0.0, 11.0, 2.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[1.338820829280222, 3.0, -0.4192650914017433, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.6824087480509881, 3.0, 1.1365801932883728, 0.0, 12.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
[0.0259966668217541, 3.0, 1.1365801932883728, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 0.3694653783607877, 38.0]
[-0.4116113873310685, 3.0, -0.4192650914017433, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 38.0]
[0.39067004528243965, 6.0, 1.1365801932883728, 4.0, 4.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]

The gcloud command-line tool accepts newline-delimited JSON for online prediction, and this particular Keras model expects a flat list of numbers for each input example.

Cloud ML Engine requires a different format when you make online prediction requests to the REST API without using the gcloud tool. The way you structure your model may also change how you must format data for prediction. Learn more about formatting data for online prediction.
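As a rough sketch (not part of the sample code), the REST prediction method wraps rows in a JSON object with an `instances` list. The feature values below are the first two examples from prediction_input.json, with precision truncated for readability:

```python
import json

# Sketch: wrap newline-delimited rows into the {"instances": [...]} body
# used for REST online prediction requests. The rows are the first two
# examples from prediction_input.json above, with truncated precision.
rows = [
    [0.90, 1.0, 1.53, 2.0, 9.0, 0.0, 4.0, -0.14, -0.22, -0.44, 38.0],
    [-0.92, 3.0, -0.42, 4.0, 2.0, 3.0, 4.0, -0.14, -0.22, -0.03, 38.0],
]
request_body = json.dumps({'instances': rows})
```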

Submit the online prediction request

Use gcloud to submit your online prediction request:

gcloud ml-engine predict \
  --model $MODEL_NAME \
  --version $MODEL_VERSION \
  --json-instances prediction_input.json
DENSE_4
[0.6854287385940552]
[0.011786997318267822]
[0.037236183881759644]
[0.016223609447479248]
[0.0012015104293823242]
[0.23621389269828796]
[0.6174039244651794]
[0.9822691679000854]
[0.3815768361091614]
[0.6715215444564819]
[0.001094043254852295]
[0.43077391386032104]
[0.22132840752601624]
[0.004075437784194946]
[0.22736871242523193]
[0.4111979305744171]
[0.27328649163246155]
[0.6981356143951416]
[0.3309604525566101]
[0.20807647705078125]

Since the model's last layer uses a sigmoid function for its activation, outputs between 0 and 0.5 represent negative predictions ("<=50K"), and outputs between 0.5 and 1 represent positive ones (">50K").
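To make that threshold concrete, here is a small sketch (not part of the tutorial code) that applies the 0.5 cutoff to a few of the sigmoid outputs shown above:

```python
# Apply the 0.5 decision threshold to sample sigmoid outputs
# (values copied from the prediction results above).
predictions = [0.6854287385940552, 0.011786997318267822,
               0.9822691679000854, 0.43077391386032104]
labels = ['>50K' if p >= 0.5 else '<=50K' for p in predictions]
print(labels)  # ['>50K', '<=50K', '>50K', '<=50K']
```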

Developing the Keras model from scratch

At this point, you have trained a machine learning model on Cloud ML Engine, deployed the trained model as a version resource on Cloud ML Engine, and received online predictions from the deployment. The next section walks through recreating the Keras code used to train your model. It covers the following parts of developing a machine learning model for use with Cloud ML Engine:

  • Downloading and preprocessing data
  • Designing and training the model
  • Visualizing training and exporting the trained model

While this section provides more detailed insight into the tasks completed in previous parts, to learn more about using tf.keras itself, read TensorFlow's guide to Keras. To learn more about structuring code as a training package for Cloud ML Engine, read Packaging a training application and reference the complete training code, which is structured as a Python package.

Import libraries and define constants

First, import the Python libraries required for training:

import os
from six.moves import urllib
import tempfile

import numpy as np
import pandas as pd
import tensorflow as tf

# Examine software versions
print(__import__('sys').version)
print(tf.__version__)
print(tf.keras.__version__)

Then, define some useful constants:

  • Information for downloading training and evaluation data
  • Information required for Pandas to interpret the data and convert categorical fields into numeric features
  • Hyperparameters for training, such as learning rate and batch size

### For downloading data ###

# Storage directory
DATA_DIR = os.path.join(tempfile.gettempdir(), 'census_data')

# Download options.
DATA_URL = 'https://storage.googleapis.com/cloud-samples-data/ml-engine' \
           '/census/data'
TRAINING_FILE = 'adult.data.csv'
EVAL_FILE = 'adult.test.csv'
TRAINING_URL = '%s/%s' % (DATA_URL, TRAINING_FILE)
EVAL_URL = '%s/%s' % (DATA_URL, EVAL_FILE)

### For interpreting data ###

# These are the features in the dataset.
# Dataset information: https://archive.ics.uci.edu/ml/datasets/census+income
_CSV_COLUMNS = [
    'age', 'workclass', 'fnlwgt', 'education', 'education_num',
    'marital_status', 'occupation', 'relationship', 'race', 'gender',
    'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
    'income_bracket'
]

_CATEGORICAL_TYPES = {
  'workclass': pd.api.types.CategoricalDtype(categories=[
    'Federal-gov', 'Local-gov', 'Never-worked', 'Private', 'Self-emp-inc',
    'Self-emp-not-inc', 'State-gov', 'Without-pay'
  ]),
  'marital_status': pd.api.types.CategoricalDtype(categories=[
    'Divorced', 'Married-AF-spouse', 'Married-civ-spouse',
    'Married-spouse-absent', 'Never-married', 'Separated', 'Widowed'
  ]),
  'occupation': pd.api.types.CategoricalDtype([
    'Adm-clerical', 'Armed-Forces', 'Craft-repair', 'Exec-managerial',
    'Farming-fishing', 'Handlers-cleaners', 'Machine-op-inspct',
    'Other-service', 'Priv-house-serv', 'Prof-specialty', 'Protective-serv',
    'Sales', 'Tech-support', 'Transport-moving'
  ]),
  'relationship': pd.api.types.CategoricalDtype(categories=[
    'Husband', 'Not-in-family', 'Other-relative', 'Own-child', 'Unmarried',
    'Wife'
  ]),
  'race': pd.api.types.CategoricalDtype(categories=[
    'Amer-Indian-Eskimo', 'Asian-Pac-Islander', 'Black', 'Other', 'White'
  ]),
  'native_country': pd.api.types.CategoricalDtype(categories=[
    'Cambodia', 'Canada', 'China', 'Columbia', 'Cuba', 'Dominican-Republic',
    'Ecuador', 'El-Salvador', 'England', 'France', 'Germany', 'Greece',
    'Guatemala', 'Haiti', 'Holand-Netherlands', 'Honduras', 'Hong', 'Hungary',
    'India', 'Iran', 'Ireland', 'Italy', 'Jamaica', 'Japan', 'Laos', 'Mexico',
    'Nicaragua', 'Outlying-US(Guam-USVI-etc)', 'Peru', 'Philippines', 'Poland',
    'Portugal', 'Puerto-Rico', 'Scotland', 'South', 'Taiwan', 'Thailand',
    'Trinadad&Tobago', 'United-States', 'Vietnam', 'Yugoslavia'
  ]),
  'income_bracket': pd.api.types.CategoricalDtype(categories=[
    '<=50K', '>50K'
  ])
}

# This is the label (target) we want to predict.
_LABEL_COLUMN = 'income_bracket'

### Hyperparameters for training ###

# This is the training batch size
BATCH_SIZE = 128

# This is the number of epochs (passes over the full training data)
NUM_EPOCHS = 20

# Define learning rate.
LEARNING_RATE = .01

Download and preprocess data

Download the data

Next, define functions for downloading training and evaluation data. These functions also fix minor irregularities in the data's formatting.

def _download_and_clean_file(filename, url):
  """Downloads data from url, and makes changes to match the CSV format.

  The CSVs may use spaces after the comma delimiters (non-standard) or include
  rows which do not represent well-formed examples. This function strips out
  some of these problems.

  Args:
    filename: filename to save url to
    url: URL of resource to download
  """
  temp_file, _ = urllib.request.urlretrieve(url)
  with tf.gfile.Open(temp_file, 'r') as temp_file_object:
    with tf.gfile.Open(filename, 'w') as file_object:
      for line in temp_file_object:
        line = line.strip()
        line = line.replace(', ', ',')
        if not line or ',' not in line:
          continue
        if line[-1] == '.':
          line = line[:-1]
        line += '\n'
        file_object.write(line)
  tf.gfile.Remove(temp_file)

def download(data_dir):
  """Downloads census data if it is not already present.

  Args:
    data_dir: directory where we will access/save the census data
  """
  tf.gfile.MakeDirs(data_dir)

  training_file_path = os.path.join(data_dir, TRAINING_FILE)
  if not tf.gfile.Exists(training_file_path):
    _download_and_clean_file(training_file_path, TRAINING_URL)

  eval_file_path = os.path.join(data_dir, EVAL_FILE)
  if not tf.gfile.Exists(eval_file_path):
    _download_and_clean_file(eval_file_path, EVAL_URL)

  return training_file_path, eval_file_path

Use these functions to download the data for training, and verify that you have CSV files for training and evaluation:

training_file_path, eval_file_path = download(DATA_DIR)

Next, load these files with Pandas and examine the data:

# This census data uses the value '?' for fields (column) that are missing data.
# We use na_values to find ? and set it to NaN values.
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html

train_df = pd.read_csv(training_file_path, names=_CSV_COLUMNS, na_values='?')
eval_df = pd.read_csv(eval_file_path, names=_CSV_COLUMNS, na_values='?')

The following table shows an excerpt of the data (train_df.head()) before preprocessing:

age workclass fnlwgt education education_num marital_status occupation relationship race gender capital_gain capital_loss hours_per_week native_country income_bracket
0 39 State-gov 77516 Bachelors 13 Never-married Adm-clerical Not-in-family White Male 2174 0 40 United-States <=50K
1 50 Self-emp-not-inc 83311 Bachelors 13 Married-civ-spouse Exec-managerial Husband White Male 0 0 13 United-States <=50K
2 38 Private 215646 HS-grad 9 Divorced Handlers-cleaners Not-in-family White Male 0 0 40 United-States <=50K
3 53 Private 234721 11th 7 Married-civ-spouse Handlers-cleaners Husband Black Male 0 0 40 United-States <=50K
4 28 Private 338409 Bachelors 13 Married-civ-spouse Prof-specialty Wife Black Female 0 0 40 Cuba <=50K

Preprocess the data

The first step of preprocessing removes certain features from the data and converts categorical features to numerical values for use with Keras.

Learn more about feature engineering and bias in data.

UNUSED_COLUMNS = ['fnlwgt', 'education', 'gender']

def preprocess(dataframe):
  """Converts categorical features to numeric. Removes unused columns.

  Args:
    dataframe: Pandas dataframe with raw data

  Returns:
    Dataframe with preprocessed data
  """
  dataframe = dataframe.drop(columns=UNUSED_COLUMNS)

  # Convert integer valued (numeric) columns to floating point
  numeric_columns = dataframe.select_dtypes(['int64']).columns
  dataframe[numeric_columns] = dataframe[numeric_columns].astype('float32')

  # Convert categorical columns to numeric
  cat_columns = dataframe.select_dtypes(['object']).columns
  dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.astype(
    _CATEGORICAL_TYPES[x.name]))
  dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.cat.codes)
  return dataframe

prepped_train_df = preprocess(train_df)
prepped_eval_df = preprocess(eval_df)

The following table (prepped_train_df.head()) shows how preprocessing changed the data. Notice in particular that income_bracket, the label that you're training the model to predict, has changed from <=50K and >50K to 0 and 1:

age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country income_bracket
0 39.0 6 13.0 4 0 1 4 2174.0 0.0 40.0 38 0
1 50.0 5 13.0 2 3 0 4 0.0 0.0 13.0 38 0
2 38.0 3 9.0 0 5 1 4 0.0 0.0 40.0 38 0
3 53.0 3 7.0 2 5 0 2 0.0 0.0 40.0 38 0
4 28.0 3 13.0 2 9 5 2 0.0 0.0 40.0 4 0

Next, separate the data into features ("x") and labels ("y"), and reshape the label arrays into a format suitable for later use with tf.data.Dataset:

# Split train and test data with labels.
# The pop() method will extract (copy) and remove the label column from the dataframe
train_x, train_y = prepped_train_df, prepped_train_df.pop(_LABEL_COLUMN)
eval_x, eval_y = prepped_eval_df, prepped_eval_df.pop(_LABEL_COLUMN)

# Reshape label columns for use with tf.data.Dataset
train_y = np.asarray(train_y).astype('float32').reshape((-1, 1))
eval_y = np.asarray(eval_y).astype('float32').reshape((-1, 1))

Scaling the training data so that each numerical feature column has a mean of 0 and a standard deviation of 1 can improve your model.

In a production system, you may want to save the means and standard deviations from your training set and use them to perform an identical transformation on test data at prediction time. For convenience in this exercise, temporarily combine the training and evaluation data to scale all of them:

def standardize(dataframe):
  """Scales numerical columns using their means and standard deviation to get
  z-scores: the mean of each numerical column becomes 0, and the standard
  deviation becomes 1. This can help the model converge during training.

  Args:
    dataframe: Pandas dataframe

  Returns:
    Input dataframe with the numerical columns scaled to z-scores
  """
  dtypes = list(zip(dataframe.dtypes.index, map(str, dataframe.dtypes)))
  # Normalize numeric columns.
  for column, dtype in dtypes:
      if dtype == 'float32':
          dataframe[column] -= dataframe[column].mean()
          dataframe[column] /= dataframe[column].std()
  return dataframe

# Join train_x and eval_x to normalize on overall means and standard
# deviations. Then separate them again.
all_x = pd.concat([train_x, eval_x], keys=['train', 'eval'])
all_x = standardize(all_x)
train_x, eval_x = all_x.xs('train'), all_x.xs('eval')
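The production approach described above could look something like the following sketch; the column names and values here are illustrative, not taken from the Census data:

```python
import pandas as pd

# Compute scaling statistics on the training split only, keep them, and
# apply the identical transformation to new data at prediction time.
train = pd.DataFrame({'age': [20.0, 30.0, 40.0],
                      'hours_per_week': [35.0, 40.0, 45.0]})
new_data = pd.DataFrame({'age': [25.0], 'hours_per_week': [50.0]})

# Saved statistics from the training set (std uses pandas' default ddof=1).
stats = {col: (train[col].mean(), train[col].std()) for col in train.columns}

def apply_zscore(df, stats):
    out = df.copy()
    for col, (mean, std) in stats.items():
        out[col] = (out[col] - mean) / std
    return out

train_scaled = apply_zscore(train, stats)
new_scaled = apply_zscore(new_data, stats)  # same transform as training
```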

The following table (train_x.head()) shows what the fully preprocessed data looks like:

age workclass education_num marital_status occupation relationship race capital_gain capital_loss hours_per_week native_country
0 0.025997 6 1.136580 4 0 1 4 0.146933 -0.217132 -0.034039 38
1 0.828278 5 1.136580 2 3 0 4 -0.144792 -0.217132 -2.212964 38
2 -0.046938 3 -0.419265 0 5 1 4 -0.144792 -0.217132 -0.034039 38
3 1.047082 3 -1.197188 2 5 0 2 -0.144792 -0.217132 -0.034039 38
4 -0.776285 3 1.136580 2 9 5 2 -0.144792 -0.217132 -0.034039 4

Design and train the model

Create training and validation datasets

Create an input function to convert features and labels into a tf.data.Dataset for training or evaluation:

def input_fn(features, labels, shuffle, num_epochs, batch_size):
  """Generates an input function to be used for model training.

  Args:
    features: numpy array of features used for training or inference
    labels: numpy array of labels for each example
    shuffle: boolean for whether to shuffle the data or not (set True for
      training, False for evaluation)
    num_epochs: number of epochs to provide the data for
    batch_size: batch size for training

  Returns:
    A tf.data.Dataset that can provide data to the Keras model for training or
      evaluation
  """
  if labels is None:
    inputs = features
  else:
    inputs = (features, labels)
  dataset = tf.data.Dataset.from_tensor_slices(inputs)

  if shuffle:
    dataset = dataset.shuffle(buffer_size=len(features))

  # We call repeat after shuffling, rather than before, to prevent separate
  # epochs from blending together.
  dataset = dataset.repeat(num_epochs)
  dataset = dataset.batch(batch_size)
  return dataset

Next, create these training and evaluation datasets. Use the NUM_EPOCHS and BATCH_SIZE hyperparameters defined previously to define how the training dataset provides examples to the model during training. Set up the validation dataset to provide all of its examples in one batch, for a single validation step at the end of each training epoch.

# Pass a numpy array by using DataFrame.values
training_dataset = input_fn(features=train_x.values,
                    labels=train_y,
                    shuffle=True,
                    num_epochs=NUM_EPOCHS,
                    batch_size=BATCH_SIZE)

num_eval_examples = eval_x.shape[0]

# Pass a numpy array by using DataFrame.values
validation_dataset = input_fn(features=eval_x.values,
                    labels=eval_y,
                    shuffle=False,
                    num_epochs=NUM_EPOCHS,
                    batch_size=num_eval_examples)

Design your Keras model

Design your neural network using the Keras Sequential API.

This deep neural network (DNN) has several hidden layers, and the last layer uses a sigmoid activation function to output a value between 0 and 1:

  • The input layer has 100 units using the ReLU activation function.
  • A hidden layer has 75 units using the ReLU activation function.
  • A hidden layer has 50 units using the ReLU activation function.
  • A hidden layer has 25 units using the ReLU activation function.
  • The output layer has 1 unit using a sigmoid activation function.
  • The optimizer uses the binary cross-entropy loss function, which is appropriate for a binary classification problem like this one.

Feel free to change these layers to try to improve the model:

def create_keras_model(input_dim, learning_rate):
  """Creates Keras Model for Binary Classification.

  Args:
    input_dim: How many features the input has
    learning_rate: Learning rate for training

  Returns:
    The compiled Keras model (still needs to be trained)
  """
  model = tf.keras.Sequential()
  model.add(
      tf.keras.layers.Dense(
          100,
          activation=tf.nn.relu,
          kernel_initializer='uniform',
          input_shape=(input_dim,)))
  model.add(tf.keras.layers.Dense(75, activation=tf.nn.relu))
  model.add(tf.keras.layers.Dense(50, activation=tf.nn.relu))
  model.add(tf.keras.layers.Dense(25, activation=tf.nn.relu))
  # The single output node and Sigmoid activation makes this a Logistic
  # Regression.
  model.add(tf.keras.layers.Dense(1, activation=tf.nn.sigmoid))

  # Custom Optimizer:
  # https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
  optimizer = tf.keras.optimizers.RMSprop(
      lr=learning_rate, rho=0.9, epsilon=1e-08, decay=0.0)

  # Compile Keras model
  model.compile(
      loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
  return model

Next, create the Keras model object:

num_train_examples, input_dim = train_x.shape
print('Number of features: {}'.format(input_dim))
print('Number of examples: {}'.format(num_train_examples))

keras_model = create_keras_model(
    input_dim=input_dim,
    learning_rate=LEARNING_RATE)

Examine the model with keras_model.summary(). It should return something like the following:

Number of features: 11
Number of examples: 32561
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense (Dense)                (None, 100)               1200
_________________________________________________________________
dense_1 (Dense)              (None, 75)                7575
_________________________________________________________________
dense_2 (Dense)              (None, 50)                3800
_________________________________________________________________
dense_3 (Dense)              (None, 25)                1275
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 26
=================================================================
Total params: 13,876
Trainable params: 13,876
Non-trainable params: 0
_________________________________________________________________
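The parameter counts in the summary above can be checked by hand: a Dense layer has units × (inputs + 1) parameters, one weight per input plus one bias per unit. A quick sketch:

```python
# Verify the Param # column of the summary: each Dense layer has
# units * (inputs + 1) parameters (weights plus one bias per unit).
layer_sizes = [11, 100, 75, 50, 25, 1]  # input features, then units per layer
params = [out * (inp + 1) for inp, out in zip(layer_sizes, layer_sizes[1:])]
print(params)       # [1200, 7575, 3800, 1275, 26]
print(sum(params))  # 13876
```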

Train and evaluate the model

Define a learning rate decay to encourage the model parameters to make smaller changes as training goes on:

# Setup Learning Rate decay.
lr_decay = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: LEARNING_RATE + 0.02 * (0.5 ** (1 + epoch)),
    verbose=True)
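Evaluating this schedule for the first few epochs shows how the decayed term halves each epoch while converging toward the base LEARNING_RATE of 0.01; the values match the "reducing learning rate" lines in the training log. A quick sketch:

```python
# Evaluate the LearningRateScheduler formula for the first few epochs.
# The decayed term 0.02 * 0.5**(1 + epoch) halves every epoch.
LEARNING_RATE = 0.01  # same base rate as defined earlier in the tutorial
rates = [LEARNING_RATE + 0.02 * (0.5 ** (1 + epoch)) for epoch in range(3)]
```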

Finally, train the model. Provide the appropriate steps_per_epoch so that the model trains on the entire training dataset (with BATCH_SIZE examples per step) during each epoch. And instruct the model to calculate validation accuracy with one big validation batch at the end of each epoch.

history = keras_model.fit(training_dataset,
                          epochs=NUM_EPOCHS,
                          steps_per_epoch=int(num_train_examples/BATCH_SIZE),
                          validation_data=validation_dataset,
                          validation_steps=1,
                          callbacks=[lr_decay],
                          verbose=1)

The training progress may look like the following:

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.

Epoch 00001: LearningRateScheduler reducing learning rate to 0.02.
Epoch 1/20
254/254 [==============================] - 1s 5ms/step - loss: 0.6986 - acc: 0.7893 - val_loss: 0.3894 - val_acc: 0.8329

Epoch 00002: LearningRateScheduler reducing learning rate to 0.015.
Epoch 2/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3574 - acc: 0.8335 - val_loss: 0.3861 - val_acc: 0.8131

...

Epoch 00019: LearningRateScheduler reducing learning rate to 0.010000038146972657.
Epoch 19/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3239 - acc: 0.8512 - val_loss: 0.3334 - val_acc: 0.8496

Epoch 00020: LearningRateScheduler reducing learning rate to 0.010000019073486329.
Epoch 20/20
254/254 [==============================] - 1s 4ms/step - loss: 0.3279 - acc: 0.8504 - val_loss: 0.3174 - val_acc: 0.8523

Visualize training and export the trained model

Visualize training

Import matplotlib to visualize how the model learned over the training period. (If necessary, first install it with pip install matplotlib.)

from matplotlib import pyplot as plt

Plot the model's loss (binary cross-entropy) and accuracy, as measured at the end of each training epoch:

# Visualize History for Loss.
plt.title('Keras model loss')
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['training', 'validation'], loc='upper right')
plt.show()

# Visualize History for Accuracy.
plt.title('Keras model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc='lower right')
plt.show()

Over time, loss decreases and accuracy increases. But do they converge to a stable level? Are there big differences between the training and validation metrics (a sign of overfitting)?
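One way to make the overfitting question concrete is to compute the gap between the final training and validation accuracy stored in the History object. This is a rough illustration only; the sample values below are taken from the final epoch of the training log above:

```python
def final_accuracy_gap(history_dict):
    """Gap between final training and validation accuracy.

    A large positive gap (training accuracy well above validation
    accuracy) is one common sign of overfitting.
    """
    return history_dict['acc'][-1] - history_dict['val_acc'][-1]

# Final-epoch values from the sample training log above:
sample_history = {'acc': [0.8512, 0.8504], 'val_acc': [0.8496, 0.8523]}
print('accuracy gap: %.4f' % final_accuracy_gap(sample_history))
# accuracy gap: -0.0019
```

Here the gap is slightly negative, so these particular metrics show no obvious overfitting.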

Learn how to improve your machine learning model. Then you can adjust the hyperparameters or the model architecture and train again.

Export your model for serving

Use tf.contrib.saved_model.save_keras_model to export a TensorFlow SavedModel directory. This is the format that Cloud ML Engine requires when you create a model version resource.

Since not all optimizers can be exported to the SavedModel format, you may see warnings during the export process. As long as you successfully export a serving graph, Cloud ML Engine can use the SavedModel to serve predictions.

# Export the model to a local SavedModel directory
export_path = tf.contrib.saved_model.save_keras_model(keras_model, 'keras_export')
print("Model exported to: ", export_path)
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.

Consider using a TensorFlow optimizer from `tf.train`.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py:1436: update_checkpoint_state (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.train.CheckpointManager to manage checkpoints rather than manually editing the Checkpoint proto.
WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:205: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: keras_export/1553710367/saved_model.pb
Model exported to:  b'keras_export/1553710367'

With the required permissions, you can export a SavedModel directory to your local filesystem or to Cloud Storage. In your current environment, you granted access to Cloud Storage by authenticating your GCP account and setting the GOOGLE_APPLICATION_CREDENTIALS environment variable. Cloud ML Engine training jobs can also export directly to Cloud Storage, because the Cloud ML Engine service account has access to Cloud Storage buckets in its own project.

Try exporting directly to Cloud Storage:

JOB_DIR = os.getenv('JOB_DIR')

# Export the model to a SavedModel directory in Cloud Storage
export_path = tf.contrib.saved_model.save_keras_model(keras_model, JOB_DIR + '/keras_export')
print("Model exported to: ", export_path)
WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved.

Consider using a TensorFlow optimizer from `tf.train`.
WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel.
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:No assets to save.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: gs://your-bucket-name/keras-job-dir/keras_export/1553710379/saved_model.pb
Model exported to:  b'gs://your-bucket-name/keras-job-dir/keras_export/1553710379'

You can now deploy this model to Cloud ML Engine and serve predictions by following the steps in the prediction quickstart.

Cleaning up

To clean up all GCP resources used in this project, delete the GCP project that you used for this tutorial.

Alternatively, you can clean up individual resources by running the following commands:

# Delete model version resource
gcloud ml-engine versions delete $MODEL_VERSION --quiet --model $MODEL_NAME

# Delete model resource
gcloud ml-engine models delete $MODEL_NAME --quiet

# Delete Cloud Storage objects that were created
gsutil -m rm -r $JOB_DIR

# If training job is still running, cancel it
gcloud ml-engine jobs cancel $JOB_NAME --quiet

If your Cloud Storage bucket doesn't contain any other objects and you would like to delete it, run gsutil rm -r gs://$BUCKET_NAME.

What's next

  • View the complete training code used in this guide, which structures the code so that custom hyperparameters can be passed as command-line flags.
  • See how to package code for a Cloud ML Engine training job.
  • See how to deploy a model to serve predictions.