Run this tutorial as a notebook in Colab | View the notebook on GitHub |
This tutorial shows how to train a neural network on AI Platform using the Keras sequential API and how to serve predictions from that model.
Keras is a high-level API for building and training deep learning models. tf.keras is TensorFlow’s implementation of this API.
The first two parts of the tutorial walk through training a model on AI Platform using prewritten Keras code, deploying the trained model to AI Platform, and serving online predictions from the deployed model.
The last part of the tutorial digs into the training code used for this model and ensuring it's compatible with AI Platform. To learn more about building machine learning models in Keras more generally, read TensorFlow's Keras tutorials.
Dataset
This tutorial uses the United States Census Income Dataset provided by the UC Irvine Machine Learning Repository. This dataset contains information about people from a 1994 Census database, including age, education, marital status, occupation, and whether they make more than $50,000 a year.
Objective
The goal is to train a deep neural network (DNN) using Keras that predicts whether a person makes more than $50,000 a year (target label) based on other Census information about the person (features).
This tutorial focuses more on using this model with AI Platform than on the design of the model itself. However, it's always important to think about potential problems and unintended consequences when building machine learning systems. See the Machine Learning Crash Course exercise about fairness to learn about sources of bias in the Census dataset, as well as machine learning fairness more generally.
Costs
This tutorial uses billable components of Google Cloud (Google Cloud):
- AI Platform Training
- AI Platform Prediction
- Cloud Storage
Learn about AI Platform Training pricing, AI Platform Prediction pricing, and Cloud Storage pricing, and use the Pricing Calculator to generate a cost estimate based on your projected usage.
Before you begin
You must do several things before you can train and deploy a model in AI Platform:
- Set up your local development environment.
- Set up a Google Cloud project with billing and the necessary APIs enabled.
- Create a Cloud Storage bucket to store your training package and your trained model.
Set up your local development environment
You need the following to complete this tutorial:
- Git
- Python 3
- virtualenv
- The Google Cloud SDK
The Google Cloud guide to Setting up a Python development environment provides detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:
Install virtualenv and create a virtual environment that uses Python 3.
Activate that environment.
Complete the steps in the following section to install the Google Cloud SDK.
Set up your Google Cloud project
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
Enable the AI Platform Training & Prediction and Compute Engine APIs.
- Install the Google Cloud CLI.
-
To initialize the gcloud CLI, run the following command:
gcloud init
-
In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
-
Make sure that billing is enabled for your Google Cloud project.
-
Enable the AI Platform Training & Prediction and Compute Engine APIs.
- Install the Google Cloud CLI.
-
To initialize the gcloud CLI, run the following command:
gcloud init
Authenticate your GCP account
To set up authentication, you need to create a service account key and set an environment variable for the file path to the service account key.
-
Create a service account:
-
In the Google Cloud console, go to the Create service account page.
- In the Service account name field, enter a name.
- Optional: In the Service account description field, enter a description.
- Click Create.
- Click the Select a role field. Under All roles, select AI Platform > AI Platform Admin.
- Click Add another role.
-
Click the Select a role field. Under All roles, select Storage > Storage Object Admin.
-
Click Done to create the service account.
Do not close your browser window. You will use it in the next step.
-
-
Create a service account key for authentication:
- In the Google Cloud console, click the email address for the service account that you created.
- Click Keys.
- Click Add key, then Create new key.
- Click Create. A JSON key file is downloaded to your computer.
- Click Close.
-
Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the file path of the JSON file that contains your service account key. This variable only applies to your current shell session, so if you open a new session, set the variable again.
Create a Cloud Storage bucket
When you submit a training job using the Cloud SDK, you upload a Python package containing your training code to a Cloud Storage bucket. AI Platform runs the code from this package. In this tutorial, AI Platform also saves the trained model that results from your job in the same bucket. You can then create an AI Platform model version based on this output in order to serve online predictions.
Set the name of your Cloud Storage bucket as an environment variable. It must be unique across all Cloud Storage buckets:
BUCKET_NAME="your-bucket-name"
Select a region where AI Platform Training and AI Platform Prediction are available, and create another environment variable. For example:
REGION="us-central1"
Create your Cloud Storage bucket in this region and, later, use the same region for training and prediction. Run the following command to create the bucket if it doesn't already exist:
gsutil mb -l $REGION gs://$BUCKET_NAME
Quickstart for training in AI Platform
This section of the tutorial walks you through submitting a training job to AI Platform. This job runs sample code that uses Keras to train a deep neural network on the United States Census data. It outputs the trained model as a TensorFlow SavedModel directory in your Cloud Storage bucket.
Get training code and dependencies
First, download the training code and change the working directory:
# Clone the repository of AI Platform samples
git clone --depth 1 https://github.com/GoogleCloudPlatform/cloudml-samples
# Set the working directory to the sample code directory
cd cloudml-samples/census/tf-keras
Notice that the training code is structured as a Python package in the
trainer/
subdirectory:
# `ls` shows the working directory's contents. The `p` flag adds trailing
# slashes to subdirectory names. The `R` flag lists subdirectories recursively.
ls -pR
.: README.md requirements.txt trainer/ ./trainer: __init__.py model.py task.py util.py
Next, install Python dependencies needed to train the model locally:
pip install -r requirements.txt
When you run the training job in AI Platform, dependencies are preinstalled based on the runtime version you choose.
Train your model locally
Before training on AI Platform, train the job locally to verify the file structure and packaging is correct.
For a complex or resource-intensive job, you may want to train locally on a small sample of your dataset to verify your code. Then you can run the job on AI Platform to train on the whole dataset.
This sample runs a relatively quick job on a small dataset, so the local training and the AI Platform job run the same code on the same data.
Run the following command to train a model locally:
# This is similar to `python -m trainer.task --job-dir local-training-output`
# but it better replicates the AI Platform environment, especially
# for distributed training (not applicable here).
gcloud ai-platform local train \
--package-path trainer \
--module-name trainer.task \
--job-dir local-training-output
Observe training progress in your shell. At the end, the training application exports the trained model and prints a message like the following:
Model exported to: local-training-output/keras_export/1553709223
Train your model using AI Platform
Next, submit a training job to AI Platform. This runs the training module in the cloud and exports the trained model to Cloud Storage.
First, give your training job a name and choose a directory within your Cloud Storage bucket for saving intermediate and output files. Set these as environment variables. For example:
JOB_NAME="my_first_keras_job"
JOB_DIR="gs://$BUCKET_NAME/keras-job-dir"
Run the following command to package the trainer/
directory, upload it to the
specified --job-dir
, and instruct AI Platform to run the
trainer.task
module from that package.
The --stream-logs
flag lets you view training logs in your shell. You can also
see logs and other job details in the Google Cloud Console.
gcloud ai-platform jobs submit training $JOB_NAME \
--package-path trainer/ \
--module-name trainer.task \
--region $REGION \
--python-version 3.7 \
--runtime-version 1.15 \
--job-dir $JOB_DIR \
--stream-logs
This may take longer than local training, but you can observe training progress in your shell in a similar fashion. At the end, the training job exports the trained model to your Cloud Storage bucket and prints a message like the following:
INFO 2019-03-27 17:57:11 +0000 master-replica-0 Model exported to: gs://your-bucket-name/keras-job-dir/keras_export/1553709421 INFO 2019-03-27 17:57:11 +0000 master-replica-0 Module completed; cleaning up. INFO 2019-03-27 17:57:11 +0000 master-replica-0 Clean up finished. INFO 2019-03-27 17:57:11 +0000 master-replica-0 Task completed successfully.
Hyperparameter tuning
You can optionally perform hyperparameter tuning by using the included
hptuning_config.yaml
configuration file. This file tells
AI Platform to tune the batch size and learning rate for training
over multiple trials to maximize accuracy.
In this example, the training code uses a TensorBoard
callback,
which creates TensorFlow Summary
Event
s
during training. AI Platform uses these events to track the metric
you want to optimize. Learn more about hyperparameter tuning in
AI Platform Training.
gcloud ai-platform jobs submit training ${JOB_NAME}_hpt \
--config hptuning_config.yaml \
--package-path trainer/ \
--module-name trainer.task \
--region $REGION \
--python-version 3.7 \
--runtime-version 1.15 \
--job-dir $JOB_DIR \
--stream-logs
Quickstart for online predictions in AI Platform
This section shows how to use AI Platform and your trained model from the previous section to predict a person's income bracket from other Census information about them.
Create model and version resources in AI Platform
To serve online predictions using the model you trained and exported in the quickstart for training, create a model resource in AI Platform and a version resource within it. The version resource is what actually uses your trained model to serve predictions. This structure lets you adjust and retrain your model many times and organize all the versions together in AI Platform. Learn more about models and versions.
First, name and create the model resource:
MODEL_NAME="my_first_keras_model"
gcloud ai-platform models create $MODEL_NAME \
--regions $REGION
Created ml engine model [projects/your-project-id/models/my_first_keras_model].
Next, create the model version. The training job from the quickstart for training exported a timestamped TensorFlow SavedModel directory to your Cloud Storage bucket. AI Platform uses this directory to create a model version. Learn more about SavedModel and AI Platform.
You may be able to find the path to this directory in your training job's logs. Look for a line like:
Model exported to: gs://your-bucket-name/keras-job-dir/keras_export/1545439782
Execute the following command to identify your SavedModel directory and use it to create a model version resource:
MODEL_VERSION="v1"
# Get a list of directories in the `keras_export` parent directory. Then pick
# the directory with the latest timestamp, in case you've trained multiple
# times.
SAVED_MODEL_PATH=$(gsutil ls $JOB_DIR/keras_export | head -n 1)
# Create model version based on that SavedModel directory
gcloud ai-platform versions create $MODEL_VERSION \
--model $MODEL_NAME \
--region $REGION \
--runtime-version 1.15 \
--python-version 3.7 \
--framework tensorflow \
--origin $SAVED_MODEL_PATH
Prepare input for prediction
To receive valid and useful predictions, you must preprocess input for prediction in the same way that training data was preprocessed. In a production system, you may want to create a preprocessing pipeline that can be used identically at training time and prediction time.
For this exercise, use the training package's data-loading code to select a random sample from the evaluation data. This data is in the form that was used to evaluate accuracy after each epoch of training, so it can be used to send test predictions without further preprocessing.
Open the Python interpreter (python
) from your current working directory in
order to run the next several snippets of code:
from trainer import util
_, _, eval_x, eval_y = util.load_data()
prediction_input = eval_x.sample(20)
prediction_targets = eval_y[prediction_input.index]
prediction_input
age | workclass | education_num | marital_status | occupation | relationship | race | capital_gain | capital_loss | hours_per_week | native_country | |
---|---|---|---|---|---|---|---|---|---|---|---|
1979 | 0.901213 | 1 | 1.525542 | 2 | 9 | 0 | 4 | -0.144792 | -0.217132 | -0.437544 | 38 |
2430 | -0.922154 | 3 | -0.419265 | 4 | 2 | 3 | 4 | -0.144792 | -0.217132 | -0.034039 | 38 |
4214 | -1.213893 | 3 | -0.030304 | 4 | 10 | 1 | 4 | -0.144792 | -0.217132 | 1.579979 | 38 |
10389 | -0.630415 | 3 | 0.358658 | 4 | 0 | 3 | 4 | -0.144792 | -0.217132 | -0.679647 | 38 |
14525 | -1.505632 | 3 | -1.586149 | 4 | 7 | 3 | 0 | -0.144792 | -0.217132 | -0.034039 | 38 |
15040 | -0.119873 | 5 | 0.358658 | 2 | 2 | 0 | 4 | -0.144792 | -0.217132 | -0.841048 | 38 |
8409 | 0.244801 | 3 | 1.525542 | 2 | 9 | 0 | 4 | -0.144792 | -0.217132 | 1.176475 | 6 |
10628 | 0.098931 | 1 | 1.525542 | 2 | 9 | 0 | 4 | 0.886847 | -0.217132 | -0.034039 | 38 |
10942 | 0.390670 | 5 | -0.030304 | 2 | 4 | 0 | 4 | -0.144792 | -0.217132 | 4.727315 | 38 |
5129 | 1.120017 | 3 | 1.136580 | 2 | 12 | 0 | 4 | -0.144792 | -0.217132 | -0.034039 | 38 |
2096 | -1.286827 | 3 | -0.030304 | 4 | 11 | 3 | 4 | -0.144792 | -0.217132 | -1.648058 | 38 |
12463 | -0.703350 | 3 | -0.419265 | 2 | 7 | 5 | 4 | -0.144792 | 4.502280 | -0.437544 | 38 |
8528 | 0.536539 | 3 | 1.525542 | 4 | 3 | 4 | 4 | -0.144792 | -0.217132 | -0.034039 | 38 |
7093 | -1.359762 | 3 | -0.419265 | 4 | 6 | 3 | 2 | -0.144792 | -0.217132 | -0.034039 | 38 |
12565 | 0.536539 | 3 | 1.136580 | 0 | 11 | 2 | 2 | -0.144792 | -0.217132 | -0.034039 | 38 |
5655 | 1.338821 | 3 | -0.419265 | 2 | 2 | 0 | 4 | -0.144792 | -0.217132 | -0.034039 | 38 |
2322 | 0.682409 | 3 | 1.136580 | 0 | 12 | 3 | 4 | -0.144792 | -0.217132 | -0.034039 | 38 |
12652 | 0.025997 | 3 | 1.136580 | 2 | 11 | 0 | 4 | -0.144792 | -0.217132 | 0.369465 | 38 |
4755 | -0.411611 | 3 | -0.419265 | 2 | 11 | 0 | 4 | -0.144792 | -0.217132 | 1.176475 | 38 |
4413 | 0.390670 | 6 | 1.136580 | 4 | 4 | 1 | 4 | -0.144792 | -0.217132 | -0.034039 | 38 |
Notice that categorical fields, like occupation
, have already been converted
to integers (with the same mapping that was used for training). Numerical
fields, like age
, have been scaled to a
z-score.
Some fields have been dropped from the original data. Compare the prediction
input with the raw data for the same examples:
import pandas as pd
_, eval_file_path = util.download(util.DATA_DIR)
raw_eval_data = pd.read_csv(eval_file_path,
names=util._CSV_COLUMNS,
na_values='?')
raw_eval_data.iloc[prediction_input.index]
age | workclass | fnlwgt | education | education_num | marital_status | occupation | relationship | race | gender | capital_gain | capital_loss | hours_per_week | native_country | income_bracket | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1979 | 51 | Local-gov | 99064 | Masters | 14 | Married-civ-spouse | Prof-specialty | Husband | White | Male | 0 | 0 | 35 | United-States | <=50K |
2430 | 26 | Private | 197967 | HS-grad | 9 | Never-married | Craft-repair | Own-child | White | Male | 0 | 0 | 40 | United-States | <=50K |
4214 | 22 | Private | 221694 | Some-college | 10 | Never-married | Protective-serv | Not-in-family | White | Male | 0 | 0 | 60 | United-States | <=50K |
10389 | 30 | Private | 96480 | Assoc-voc | 11 | Never-married | Adm-clerical | Own-child | White | Female | 0 | 0 | 32 | United-States | <=50K |
14525 | 18 | Private | 146225 | 10th | 6 | Never-married | Other-service | Own-child | Amer-Indian-Eskimo | Female | 0 | 0 | 40 | United-States | <=50K |
15040 | 37 | Self-emp-not-inc | 50096 | Assoc-voc | 11 | Married-civ-spouse | Craft-repair | Husband | White | Male | 0 | 0 | 30 | United-States | <=50K |
8409 | 42 | Private | 102988 | Masters | 14 | Married-civ-spouse | Prof-specialty | Husband | White | Male | 0 | 0 | 55 | Ecuador | >50K |
10628 | 40 | Local-gov | 284086 | Masters | 14 | Married-civ-spouse | Prof-specialty | Husband | White | Male | 7688 | 0 | 40 | United-States | >50K |
10942 | 44 | Self-emp-not-inc | 52505 | Some-college | 10 | Married-civ-spouse | Farming-fishing | Husband | White | Male | 0 | 0 | 99 | United-States | <=50K |
5129 | 54 | Private | 106728 | Bachelors | 13 | Married-civ-spouse | Tech-support | Husband | White | Male | 0 | 0 | 40 | United-States | <=50K |
2096 | 21 | Private | 190916 | Some-college | 10 | Never-married | Sales | Own-child | White | Female | 0 | 0 | 20 | United-States | <=50K |
12463 | 29 | Private | 197565 | HS-grad | 9 | Married-civ-spouse | Other-service | Wife | White | Female | 0 | 1902 | 35 | United-States | >50K |
8528 | 46 | Private | 193188 | Masters | 14 | Never-married | Exec-managerial | Unmarried | White | Male | 0 | 0 | 40 | United-States | <=50K |
7093 | 20 | Private | 273147 | HS-grad | 9 | Never-married | Machine-op-inspct | Own-child | Black | Male | 0 | 0 | 40 | United-States | <=50K |
12565 | 46 | Private | 203653 | Bachelors | 13 | Divorced | Sales | Other-relative | Black | Male | 0 | 0 | 40 | United-States | <=50K |
5655 | 57 | Private | 174662 | HS-grad | 9 | Married-civ-spouse | Craft-repair | Husband | White | Male | 0 | 0 | 40 | United-States | <=50K |
2322 | 48 | Private | 232149 | Bachelors | 13 | Divorced | Tech-support | Own-child | White | Female | 0 | 0 | 40 | United-States | <=50K |
12652 | 39 | Private | 82521 | Bachelors | 13 | Married-civ-spouse | Sales | Husband | White | Male | 0 | 0 | 45 | United-States | >50K |
4755 | 33 | Private | 330715 | HS-grad | 9 | Married-civ-spouse | Sales | Husband | White | Male | 0 | 0 | 55 | United-States | <=50K |
4413 | 44 | State-gov | 128586 | Bachelors | 13 | Never-married | Farming-fishing | Not-in-family | White | Male | 0 | 0 | 40 | United-States | <=50K |
Export the prediction input to a newline-delimited JSON file:
import json
with open('prediction_input.json', 'w') as json_file:
for row in prediction_input.values.tolist():
json.dump(row, json_file)
json_file.write('\n')
Exit the Python interpreter (exit()
). From your shell, examine
prediction_input.json
:
cat prediction_input.json
[0.9012127751273994, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.43754385253479555, 38.0] [-0.9221541171760282, 3.0, -0.4192650914017433, 4.0, 2.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [-1.2138928199445767, 3.0, -0.030303770229214273, 4.0, 10.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.5799792247041626, 38.0] [-0.6304154144074798, 3.0, 0.35865755094331475, 4.0, 0.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.6796466218034705, 38.0] [-1.5056315227131252, 3.0, -1.5861490549193304, 4.0, 7.0, 3.0, 0.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [-0.11987268456252011, 5.0, 0.35865755094331475, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.8410484679825871, 38.0] [0.24480069389816542, 3.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 6.0] [0.0989313425138912, 1.0, 1.525541514460902, 2.0, 9.0, 0.0, 4.0, 0.8868473744801746, -0.21713186390175285, -0.03403923708700391, 38.0] [0.39067004528243965, 5.0, -0.030303770229214273, 2.0, 4.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 4.7273152251969375, 38.0] [1.1200168022038106, 3.0, 1.1365801932883728, 2.0, 12.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [-1.2868274956367138, 3.0, -0.030303770229214273, 4.0, 11.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -1.6480576988781703, 38.0] [-0.7033500900996169, 3.0, -0.4192650914017433, 2.0, 7.0, 5.0, 4.0, -0.14479173735784842, 4.5022796885373735, -0.43754385253479555, 38.0] [0.5365393966667138, 3.0, 1.525541514460902, 4.0, 3.0, 4.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [-1.3597621713288508, 3.0, -0.4192650914017433, 4.0, 6.0, 3.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [0.5365393966667138, 3.0, 1.1365801932883728, 0.0, 11.0, 2.0, 2.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [1.338820829280222, 3.0, -0.4192650914017433, 2.0, 2.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [0.6824087480509881, 3.0, 1.1365801932883728, 0.0, 12.0, 3.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0] [0.0259966668217541, 3.0, 1.1365801932883728, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 0.3694653783607877, 38.0] [-0.4116113873310685, 3.0, -0.4192650914017433, 2.0, 11.0, 0.0, 4.0, -0.14479173735784842, -0.21713186390175285, 1.176474609256371, 38.0] [0.39067004528243965, 6.0, 1.1365801932883728, 4.0, 4.0, 1.0, 4.0, -0.14479173735784842, -0.21713186390175285, -0.03403923708700391, 38.0]
The gcloud
command-line tool accepts newline-delimited JSON for online
prediction, and this particular Keras model expects a flat list of numbers for
each input example.
AI Platform requires a different format when you make online
prediction requests to the REST API without using the gcloud
tool. The way you
structure your model may also change how you must format data for prediction.
Learn more about formatting data for online
prediction.
Submit the online prediction request
Use gcloud
to submit your online prediction request:
gcloud ai-platform predict \
--model $MODEL_NAME \
--region $REGION \
--version $MODEL_VERSION \
--json-instances prediction_input.json
DENSE_4 [0.6854287385940552] [0.011786997318267822] [0.037236183881759644] [0.016223609447479248] [0.0012015104293823242] [0.23621389269828796] [0.6174039244651794] [0.9822691679000854] [0.3815768361091614] [0.6715215444564819] [0.001094043254852295] [0.43077391386032104] [0.22132840752601624] [0.004075437784194946] [0.22736871242523193] [0.4111979305744171] [0.27328649163246155] [0.6981356143951416] [0.3309604525566101] [0.20807647705078125]
Since the model's last layer uses a sigmoid function for its activation, outputs between 0 and 0.5 represent negative predictions ("<=50K") and outputs between 0.5 and 1 represent positive ones (">50K").
Developing the Keras model from scratch
At this point, you have trained a machine learning model on AI Platform, deployed the trained model as a version resource on AI Platform, and received online predictions from the deployment. The next section walks through recreating the Keras code used to train your model. It covers the following parts of developing a machine learning model for use with AI Platform:
- Downloading and preprocessing data
- Designing and training the model
- Visualizing training and exporting the trained model
While this section provides more detailed insight to the tasks completed in
previous parts, to learn more about using tf.keras
, read TensorFlow's guide
to
Keras.
To learn more about structuring code as a training packge for
AI Platform, read Packaging a training
application
and reference the complete training
code,
which is structured as a Python package.
Import libraries and define constants
First, import Python libraries required for training:
import os
from six.moves import urllib
import tempfile
import numpy as np
import pandas as pd
import tensorflow as tf
# Examine software versions
print(__import__('sys').version)
print(tf.__version__)
print(tf.keras.__version__)
Then, define some useful constants:
- Information for downloading training and evaluation data
- Information required for Pandas to interpret the data and convert categorical fields into numeric features
- Hyperparameters for training, such as learning rate and batch size
### For downloading data ###
# Storage directory
DATA_DIR = os.path.join(tempfile.gettempdir(), 'census_data')
# Download options.
DATA_URL = 'https://storage.googleapis.com/cloud-samples-data/ai-platform' \
'/census/data'
TRAINING_FILE = 'adult.data.csv'
EVAL_FILE = 'adult.test.csv'
TRAINING_URL = '%s/%s' % (DATA_URL, TRAINING_FILE)
EVAL_URL = '%s/%s' % (DATA_URL, EVAL_FILE)
### For interpreting data ###
# These are the features in the dataset.
# Dataset information: https://archive.ics.uci.edu/ml/datasets/census+income
_CSV_COLUMNS = [
'age', 'workclass', 'fnlwgt', 'education', 'education_num',
'marital_status', 'occupation', 'relationship', 'race', 'gender',
'capital_gain', 'capital_loss', 'hours_per_week', 'native_country',
'income_bracket'
]
_CATEGORICAL_TYPES = {
'workclass': pd.api.types.CategoricalDtype(categories=[
'Federal-gov', 'Local-gov', 'Never-worked', 'Private', 'Self-emp-inc',
'Self-emp-not-inc', 'State-gov', 'Without-pay'
]),
'marital_status': pd.api.types.CategoricalDtype(categories=[
'Divorced', 'Married-AF-spouse', 'Married-civ-spouse',
'Married-spouse-absent', 'Never-married', 'Separated', 'Widowed'
]),
'occupation': pd.api.types.CategoricalDtype([
'Adm-clerical', 'Armed-Forces', 'Craft-repair', 'Exec-managerial',
'Farming-fishing', 'Handlers-cleaners', 'Machine-op-inspct',
'Other-service', 'Priv-house-serv', 'Prof-specialty', 'Protective-serv',
'Sales', 'Tech-support', 'Transport-moving'
]),
'relationship': pd.api.types.CategoricalDtype(categories=[
'Husband', 'Not-in-family', 'Other-relative', 'Own-child', 'Unmarried',
'Wife'
]),
'race': pd.api.types.CategoricalDtype(categories=[
'Amer-Indian-Eskimo', 'Asian-Pac-Islander', 'Black', 'Other', 'White'
]),
'native_country': pd.api.types.CategoricalDtype(categories=[
'Cambodia', 'Canada', 'China', 'Columbia', 'Cuba', 'Dominican-Republic',
'Ecuador', 'El-Salvador', 'England', 'France', 'Germany', 'Greece',
'Guatemala', 'Haiti', 'Holand-Netherlands', 'Honduras', 'Hong', 'Hungary',
'India', 'Iran', 'Ireland', 'Italy', 'Jamaica', 'Japan', 'Laos', 'Mexico',
'Nicaragua', 'Outlying-US(Guam-USVI-etc)', 'Peru', 'Philippines', 'Poland',
'Portugal', 'Puerto-Rico', 'Scotland', 'South', 'Taiwan', 'Thailand',
'Trinadad&Tobago', 'United-States', 'Vietnam', 'Yugoslavia'
]),
'income_bracket': pd.api.types.CategoricalDtype(categories=[
'<=50K', '>50K'
])
}
# This is the label (target) we want to predict.
_LABEL_COLUMN = 'income_bracket'
### Hyperparameters for training ###
# This the training batch size
BATCH_SIZE = 128
# This is the number of epochs (passes over the full training data)
NUM_EPOCHS = 20
# Define learning rate.
LEARNING_RATE = .01
Download and preprocess data
Download the data
Next, define functions to download training and evaluation data. These functions also fix minor irregularities in the data's formatting.
def _download_and_clean_file(filename, url):
"""Downloads data from url, and makes changes to match the CSV format.
The CSVs may use spaces after the comma delimters (non-standard) or include
rows which do not represent well-formed examples. This function strips out
some of these problems.
Args:
filename: filename to save url to
url: URL of resource to download
"""
temp_file, _ = urllib.request.urlretrieve(url)
with tf.gfile.Open(temp_file, 'r') as temp_file_object:
with tf.gfile.Open(filename, 'w') as file_object:
for line in temp_file_object:
line = line.strip()
line = line.replace(', ', ',')
if not line or ',' not in line:
continue
if line[-1] == '.':
line = line[:-1]
line += '\n'
file_object.write(line)
tf.gfile.Remove(temp_file)
def download(data_dir):
"""Downloads census data if it is not already present.
Args:
data_dir: directory where we will access/save the census data
"""
tf.gfile.MakeDirs(data_dir)
training_file_path = os.path.join(data_dir, TRAINING_FILE)
if not tf.gfile.Exists(training_file_path):
_download_and_clean_file(training_file_path, TRAINING_URL)
eval_file_path = os.path.join(data_dir, EVAL_FILE)
if not tf.gfile.Exists(eval_file_path):
_download_and_clean_file(eval_file_path, EVAL_URL)
return training_file_path, eval_file_path
Use those functions to download the data for training and verify that you have CSV files for training and evaluation:
training_file_path, eval_file_path = download(DATA_DIR)
Next, load these files using Pandas and examine the data:
# This census data uses the value '?' for fields (column) that are missing data.
# We use na_values to find ? and set it to NaN values.
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
train_df = pd.read_csv(training_file_path, names=_CSV_COLUMNS, na_values='?')
eval_df = pd.read_csv(eval_file_path, names=_CSV_COLUMNS, na_values='?')
The following table shows an excerpt of the data (train_df.head()
) before
preprocessing:
age | workclass | fnlwgt | education | education_num | marital_status | occupation | relationship | race | gender | capital_gain | capital_loss | hours_per_week | native_country | income_bracket | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 39 | State-gov | 77516 | Bachelors | 13 | Never-married | Adm-clerical | Not-in-family | White | Male | 2174 | 0 | 40 | United-States | <=50K |
1 | 50 | Self-emp-not-inc | 83311 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 0 | 0 | 13 | United-States | <=50K |
2 | 38 | Private | 215646 | HS-grad | 9 | Divorced | Handlers-cleaners | Not-in-family | White | Male | 0 | 0 | 40 | United-States | <=50K |
3 | 53 | Private | 234721 | 11th | 7 | Married-civ-spouse | Handlers-cleaners | Husband | Black | Male | 0 | 0 | 40 | United-States | <=50K |
4 | 28 | Private | 338409 | Bachelors | 13 | Married-civ-spouse | Prof-specialty | Wife | Black | Female | 0 | 0 | 40 | Cuba | <=50K |
Preprocess the data
The first preprocessing step removes certain features from the data and converts categorical features to numerical values for use with Keras.
Learn more about feature engineering and bias in data.
UNUSED_COLUMNS = ['fnlwgt', 'education', 'gender']
def preprocess(dataframe):
"""Converts categorical features to numeric. Removes unused columns.
Args:
dataframe: Pandas dataframe with raw data
Returns:
Dataframe with preprocessed data
"""
dataframe = dataframe.drop(columns=UNUSED_COLUMNS)
# Convert integer valued (numeric) columns to floating point
numeric_columns = dataframe.select_dtypes(['int64']).columns
dataframe[numeric_columns] = dataframe[numeric_columns].astype('float32')
# Convert categorical columns to numeric
cat_columns = dataframe.select_dtypes(['object']).columns
dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.astype(
_CATEGORICAL_TYPES[x.name]))
dataframe[cat_columns] = dataframe[cat_columns].apply(lambda x: x.cat.codes)
return dataframe
prepped_train_df = preprocess(train_df)
prepped_eval_df = preprocess(eval_df)
The following table (prepped_train_df.head()
) shows how preprocessing changed
the data. Notice in particular that income_bracket
, the label that you're
training the model to predict, has changed from <=50K
and >50K
to 0
and
1
:
age | workclass | education_num | marital_status | occupation | relationship | race | capital_gain | capital_loss | hours_per_week | native_country | income_bracket | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 39.0 | 6 | 13.0 | 4 | 0 | 1 | 4 | 2174.0 | 0.0 | 40.0 | 38 | 0 |
1 | 50.0 | 5 | 13.0 | 2 | 3 | 0 | 4 | 0.0 | 0.0 | 13.0 | 38 | 0 |
2 | 38.0 | 3 | 9.0 | 0 | 5 | 1 | 4 | 0.0 | 0.0 | 40.0 | 38 | 0 |
3 | 53.0 | 3 | 7.0 | 2 | 5 | 0 | 2 | 0.0 | 0.0 | 40.0 | 38 | 0 |
4 | 28.0 | 3 | 13.0 | 2 | 9 | 5 | 2 | 0.0 | 0.0 | 40.0 | 4 | 0 |
Next, separate the data into features ("x") and labels ("y"), and reshape the
label arrays into a format for use with tf.data.Dataset
later:
# Split train and test data with labels.
# The pop() method will extract (copy) and remove the label column from the dataframe
train_x, train_y = prepped_train_df, prepped_train_df.pop(_LABEL_COLUMN)
eval_x, eval_y = prepped_eval_df, prepped_eval_df.pop(_LABEL_COLUMN)
# Reshape label columns for use with tf.data.Dataset
train_y = np.asarray(train_y).astype('float32').reshape((-1, 1))
eval_y = np.asarray(eval_y).astype('float32').reshape((-1, 1))
Scaling training data so each numerical feature column has a mean of 0 and a standard deviation of 1 can improve your model.
In a production system, you may want to save the means and standard deviations from your training set and use them to perform an identical transformation on test data at prediction time. For convenience in this exercise, temporarily combine the training and evaluation data to scale all of them:
def standardize(dataframe):
"""Scales numerical columns using their means and standard deviation to get
z-scores: the mean of each numerical column becomes 0, and the standard
deviation becomes 1. This can help the model converge during training.
Args:
dataframe: Pandas dataframe
Returns:
Input dataframe with the numerical columns scaled to z-scores
"""
dtypes = list(zip(dataframe.dtypes.index, map(str, dataframe.dtypes)))
# Normalize numeric columns.
for column, dtype in dtypes:
if dtype == 'float32':
dataframe[column] -= dataframe[column].mean()
dataframe[column] /= dataframe[column].std()
return dataframe
# Join train_x and eval_x to normalize on overall means and standard
# deviations. Then separate them again.
all_x = pd.concat([train_x, eval_x], keys=['train', 'eval'])
all_x = standardize(all_x)
train_x, eval_x = all_x.xs('train'), all_x.xs('eval')
The next table (train_x.head()
) shows what the fully preprocessed data looks
like:
age | workclass | education_num | marital_status | occupation | relationship | race | capital_gain | capital_loss | hours_per_week | native_country | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0.025997 | 6 | 1.136580 | 4 | 0 | 1 | 4 | 0.146933 | -0.217132 | -0.034039 | 38 |
1 | 0.828278 | 5 | 1.136580 | 2 | 3 | 0 | 4 | -0.144792 | -0.217132 | -2.212964 | 38 |
2 | -0.046938 | 3 | -0.419265 | 0 | 5 | 1 | 4 | -0.144792 | -0.217132 | -0.034039 | 38 |
3 | 1.047082 | 3 | -1.197188 | 2 | 5 | 0 | 2 | -0.144792 | -0.217132 | -0.034039 | 38 |
4 | -0.776285 | 3 | 1.136580 | 2 | 9 | 5 | 2 | -0.144792 | -0.217132 | -0.034039 | 4 |
Design and train the model
Create training and validation datasets
Create an input function to convert features and labels into a
tf.data.Dataset
for training or evaluation:
def input_fn(features, labels, shuffle, num_epochs, batch_size):
"""Generates an input function to be used for model training.
Args:
features: numpy array of features used for training or inference
labels: numpy array of labels for each example
shuffle: boolean for whether to shuffle the data or not (set True for
training, False for evaluation)
num_epochs: number of epochs to provide the data for
batch_size: batch size for training
Returns:
A tf.data.Dataset that can provide data to the Keras model for training or
evaluation
"""
if labels is None:
inputs = features
else:
inputs = (features, labels)
dataset = tf.data.Dataset.from_tensor_slices(inputs)
if shuffle:
dataset = dataset.shuffle(buffer_size=len(features))
# We call repeat after shuffling, rather than before, to prevent separate
# epochs from blending together.
dataset = dataset.repeat(num_epochs)
dataset = dataset.batch(batch_size)
return dataset
Next, create these training and evaluation datasets.Use the NUM_EPOCHS
and BATCH_SIZE
hyperparameters defined previously to define how the training
dataset provides examples to the model during training. Set up the validation
dataset to provide all its examples in one batch, for a single validation step
at the end of each training epoch.
# Pass a numpy array by using DataFrame.values
training_dataset = input_fn(features=train_x.values,
labels=train_y,
shuffle=True,
num_epochs=NUM_EPOCHS,
batch_size=BATCH_SIZE)
num_eval_examples = eval_x.shape[0]
# Pass a numpy array by using DataFrame.values
validation_dataset = input_fn(features=eval_x.values,
labels=eval_y,
shuffle=False,
num_epochs=NUM_EPOCHS,
batch_size=num_eval_examples)
Design a Keras Model
Design your neural network using the Keras Sequential API.
This deep neural network (DNN) has several hidden layers, and the last layer uses a sigmoid activation function to output a value between 0 and 1:
- The input layer has 100 units using the ReLU activation function.
- The hidden layer has 75 units using the ReLU activation function.
- The hidden layer has 50 units using the ReLU activation function.
- The hidden layer has 25 units using the ReLU activation function.
- The output layer has 1 units using a sigmoid activation function.
- The optimizer uses the binary cross-entropy loss function, which is appropriate for a binary classification problem like this one.
Feel free to change these layers to try to improve the model:
def create_keras_model(input_dim, learning_rate):
"""Creates Keras Model for Binary Classification.
Args:
input_dim: How many features the input has
learning_rate: Learning rate for training
Returns:
The compiled Keras model (still needs to be trained)
"""
Dense = tf.keras.layers.Dense
model = tf.keras.Sequential(
[
Dense(100, activation=tf.nn.relu, kernel_initializer='uniform',
input_shape=(input_dim,)),
Dense(75, activation=tf.nn.relu),
Dense(50, activation=tf.nn.relu),
Dense(25, activation=tf.nn.relu),
Dense(1, activation=tf.nn.sigmoid)
])
# Custom Optimizer:
# https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer
optimizer = tf.keras.optimizers.RMSprop(
lr=learning_rate)
# Compile Keras model
model.compile(
loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
return model
Next, create the Keras model object:
num_train_examples, input_dim = train_x.shape
print('Number of features: {}'.format(input_dim))
print('Number of examples: {}'.format(num_train_examples))
keras_model = create_keras_model(
input_dim=input_dim,
learning_rate=LEARNING_RATE)
Examining the model with keras_model.summary()
should return something like:
Number of features: 11 Number of examples: 32561 WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense (Dense) (None, 100) 1200 _________________________________________________________________ dense_1 (Dense) (None, 75) 7575 _________________________________________________________________ dense_2 (Dense) (None, 50) 3800 _________________________________________________________________ dense_3 (Dense) (None, 25) 1275 _________________________________________________________________ dense_4 (Dense) (None, 1) 26 ================================================================= Total params: 13,876 Trainable params: 13,876 Non-trainable params: 0 _________________________________________________________________
Train and evaluate the model
Define a learning rate decay to encourage model parameters to make smaller changes as training goes on:
# Setup Learning Rate decay.
lr_decay_cb = tf.keras.callbacks.LearningRateScheduler(
lambda epoch: LEARNING_RATE + 0.02 * (0.5 ** (1 + epoch)),
verbose=True)
# Setup TensorBoard callback.
JOB_DIR = os.getenv('JOB_DIR')
tensorboard_cb = tf.keras.callbacks.TensorBoard(
os.path.join(JOB_DIR, 'keras_tensorboard'),
histogram_freq=1)
Finally, train the model. Provide the appropriate steps_per_epoch
for the
model to train on the entire training dataset (with BATCH_SIZE
examples per
step) during each epoch. And instruct the model to calculate validation accuracy
with one big validation batch at the end of each epoch.
history = keras_model.fit(training_dataset,
epochs=NUM_EPOCHS,
steps_per_epoch=int(num_train_examples/BATCH_SIZE),
validation_data=validation_dataset,
validation_steps=1,
callbacks=[lr_decay_cb, tensorboard_cb],
verbose=1)
Training progress may look like the following:
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. Epoch 00001: LearningRateScheduler reducing learning rate to 0.02. Epoch 1/20 254/254 [==============================] - 1s 5ms/step - loss: 0.6986 - acc: 0.7893 - val_loss: 0.3894 - val_acc: 0.8329 Epoch 00002: LearningRateScheduler reducing learning rate to 0.015. Epoch 2/20 254/254 [==============================] - 1s 4ms/step - loss: 0.3574 - acc: 0.8335 - val_loss: 0.3861 - val_acc: 0.8131 ... Epoch 00019: LearningRateScheduler reducing learning rate to 0.010000038146972657. Epoch 19/20 254/254 [==============================] - 1s 4ms/step - loss: 0.3239 - acc: 0.8512 - val_loss: 0.3334 - val_acc: 0.8496 Epoch 00020: LearningRateScheduler reducing learning rate to 0.010000019073486329. Epoch 20/20 254/254 [==============================] - 1s 4ms/step - loss: 0.3279 - acc: 0.8504 - val_loss: 0.3174 - val_acc: 0.8523
Visualize training and export the trained model
Visualize training
Import matplotlib
to visualize how the model learned over the training period.
(If necessary, first install it with pip install matplotlib
.)
from matplotlib import pyplot as plt
Plot the model's loss (binary cross-entropy) and accuracy, as measured at the end of each training epoch:
# Visualize History for Loss.
plt.title('Keras model loss')
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['training', 'validation'], loc='upper right')
plt.show()
# Visualize History for Accuracy.
plt.title('Keras model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.legend(['training', 'validation'], loc='lower right')
plt.show()
Over time, loss decreases and accuracy increases. But do they converge to a stable level? Are there big differences between the training and validation metrics (a sign of overfitting)?
Learn about how to improve your machine learning model. Then, feel free to adjust hyperparameters or the model architecture and train again.
Export the model for serving
Use tf.contrib.saved_model.save_keras_model to export a TensorFlow SavedModel directory. This is the format that AI Platform requires when you create a model version resource.
Since not all optimizers can be exported to the SavedModel format, you may see warnings during the export process. As long you successfully export a serving graph, AI Platform can used the SavedModel to serve predictions.
# Export the model to a local SavedModel directory
export_path = tf.contrib.saved_model.save_keras_model(keras_model, 'keras_export')
print("Model exported to: ", export_path)
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see: * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md * https://github.com/tensorflow/addons If you depend on functionality not listed there, please file an issue. WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved. Consider using a TensorFlow optimizer from `tf.train`. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/network.py:1436: update_checkpoint_state (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use tf.train.CheckpointManager to manage checkpoints rather than manually editing the Checkpoint proto. WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel. WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/saved_model/signature_def_utils_impl.py:205: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version. Instructions for updating: This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info. INFO:tensorflow:Signatures INCLUDED in export for Classify: None INFO:tensorflow:Signatures INCLUDED in export for Regress: None INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default'] INFO:tensorflow:Signatures INCLUDED in export for Train: None INFO:tensorflow:Signatures INCLUDED in export for Eval: None INFO:tensorflow:No assets to save. INFO:tensorflow:No assets to write. INFO:tensorflow:SavedModel written to: keras_export/1553710367/saved_model.pb Model exported to: b'keras_export/1553710367'
You may export a SavedModel directory to your local filesystem or to
Cloud Storage, as long as you have the necessary permissions. In your
current environment, you granted access to Cloud Storage by
authenticating your Google Cloud account and setting the
GOOGLE_APPLICATION_CREDENTIALS
environment variable. AI Platform
training jobs can also export directly to Cloud Storage, because
AI Platform service accounts have access to Cloud Storage
buckets in their own
project.
Try exporting directly to Cloud Storage:
JOB_DIR = os.getenv('JOB_DIR')
# Export the model to a SavedModel directory in Cloud Storage
export_path = tf.contrib.saved_model.save_keras_model(keras_model, JOB_DIR + '/keras_export')
print("Model exported to: ", export_path)
WARNING:tensorflow:This model was compiled with a Keras optimizer (<tensorflow.python.keras.optimizers.RMSprop object at 0x7fc198c4e400>) but is being saved in TensorFlow format with `save_weights`. The model's weights will be saved, but unlike with TensorFlow optimizers in the TensorFlow format the optimizer's state will not be saved. Consider using a TensorFlow optimizer from `tf.train`. WARNING:tensorflow:Model was compiled with an optimizer, but the optimizer is not from `tf.train` (e.g. `tf.train.AdagradOptimizer`). Only the serving graph was exported. The train and evaluate graphs were not added to the SavedModel. INFO:tensorflow:Signatures INCLUDED in export for Classify: None INFO:tensorflow:Signatures INCLUDED in export for Regress: None INFO:tensorflow:Signatures INCLUDED in export for Predict: ['serving_default'] INFO:tensorflow:Signatures INCLUDED in export for Train: None INFO:tensorflow:Signatures INCLUDED in export for Eval: None INFO:tensorflow:No assets to save. INFO:tensorflow:No assets to write. INFO:tensorflow:SavedModel written to: gs://your-bucket-name/keras-job-dir/keras_export/1553710379/saved_model.pb Model exported to: b'gs://your-bucket-name/keras-job-dir/keras_export/1553710379'
You can now deploy this model to AI Platform and serve predictions by following the steps from the quickstart for prediction.
Cleaning up
To clean up all Google Cloud resources used in this project, you can delete the Google Cloud project you used for the tutorial.
Alternatively, you can clean up individual resources by running the following commands:
# Delete model version resource
gcloud ai-platform versions delete $MODEL_VERSION --quiet --model $MODEL_NAME
# Delete model resource
gcloud ai-platform models delete $MODEL_NAME --quiet
# Delete Cloud Storage objects that were created
gsutil -m rm -r $JOB_DIR
# If training job is still running, cancel it
gcloud ai-platform jobs cancel $JOB_NAME --quiet
If your Cloud Storage bucket doesn't contain any other objects and you
would like to delete it, run gsutil rm -r gs://$BUCKET_NAME
.
What's next?
- View the complete training code used in this guide, which structures the code to accept custom hyperparameters as command-line flags.
- Read about packaging code for an AI Platform training job.
- Read about deploying a model to serve predictions.