Image Classification using Flowers dataset

This tutorial uses the Flowers dataset and transfer learning with the pre-trained Inception-v3 model to build a customized image classification model on AI Platform that correctly labels different types of flowers.

The sample code you will run and the results you will monitor span four parts: data preprocessing, model training with the transformed data, model deployment, and prediction requests. All four parts run in the cloud.

What you will build

You will run sample code in order to preprocess data with Cloud Dataflow and then use that transformed data to train a model with AI Platform. You will then deploy the trained model to AI Platform and test the model by sending a prediction request to it.

In this sample dataset you only have a small set of images (~3,600). Without more data it isn’t possible to use machine learning techniques to adequately train an accurate classification model from scratch. Instead, you’ll use an approach called transfer learning. In transfer learning you use a pre-trained model to extract image features that you will use to train a new classifier. In this tutorial in particular you’ll use a pre-trained model called Inception.


In this introductory, end-to-end walkthrough you will use Python code to:

  • Perform data preprocessing in the cloud, reading original data files and using Cloud Dataflow to convert them into TFRecord format for training.
  • Run training using AI Platform to obtain optimal model specifications.
  • Deploy the trained model.
  • Request a prediction from the trained model and observe the resulting accuracy.


This walkthrough uses billable components of Google Cloud Platform, including:

  • Cloud Dataflow for:
    • Preprocessing data
  • AI Platform for:
    • Training
    • Making a prediction request
  • Cloud Storage for:
    • Storing input data for training
    • Staging the trainer package
    • Writing training artifacts

Use the Pricing Calculator to generate a cost estimate based on your projected usage.

New Cloud Platform users might be eligible for a free trial.

Before you begin

Download the sample from the GitHub repository


macOS

  1. Download and extract the AI Platform sample zip file.

  2. Open a terminal window and navigate to the directory that contains the extracted cloudml-samples-master directory.

  3. Navigate to the cloudml-samples-master > flowers directory. The commands in this walkthrough must be run from the flowers directory.

    cd cloudml-samples-master/flowers

Cloud Shell

  1. Download the AI Platform sample zip file.

  2. Unzip the file to extract the cloudml-samples-master directory.

  3. Navigate to the cloudml-samples-master > flowers directory. The commands in this walkthrough must be run from the flowers directory.

    cd cloudml-samples-master/flowers

Install dependencies

The sample provides a requirements.txt file that you can use to install the dependencies required by the project.

pip install --user -r requirements.txt

Set up and test your Cloud environment

Complete the following steps to set up a GCP account, activate the AI Platform API, and install and activate the Cloud SDK.

Set up your GCP project

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a Google Cloud Platform project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your Google Cloud Platform project.

    Learn how to enable billing

  4. Enable the AI Platform ("Cloud Machine Learning Engine"), Compute Engine and Cloud Dataflow APIs.

    Enable the APIs

  5. Install and initialize the Cloud SDK.

Set up your environment

Choose one of the options below to set up your environment locally on macOS or in a remote environment on Cloud Shell.

For macOS users, we recommend that you set up your environment using the MACOS tab below. Cloud Shell, shown on the CLOUD SHELL tab, is available on macOS, Linux, and Windows. Cloud Shell provides a quick way to try AI Platform, but isn’t suitable for ongoing development work.


macOS

  1. Check Python installation
    Confirm that you have Python installed and, if necessary, install it.

    python -V
  2. Check pip installation
    pip is Python’s package manager, included with current versions of Python. Check if you already have pip installed by running pip --version. If not, see how to install pip.

    You can upgrade pip using the following command:

    pip install -U pip

    See the pip documentation for more details.

  3. Install virtualenv
    virtualenv is a tool to create isolated Python environments. Check if you already have virtualenv installed by running virtualenv --version. If not, install virtualenv:

    pip install --user --upgrade virtualenv

    To create an isolated development environment for this guide, create a new virtual environment in virtualenv. For example, the following commands create and activate an environment named cmle-env:

    virtualenv cmle-env
    source cmle-env/bin/activate
  4. For the purposes of this tutorial, run the rest of the commands within your virtual environment.

    See more information about using virtualenv. To exit virtualenv, run deactivate.

Cloud Shell

  1. Open the Google Cloud Platform Console.

    Google Cloud Platform Console

  2. Click the Activate Google Cloud Shell button at the top of the console window.

    Activate Google Cloud Shell

    A Cloud Shell session opens inside a new frame at the bottom of the console and displays a command-line prompt. It can take a few seconds for the shell session to be initialized.

    Cloud Shell session

    Your Cloud Shell session is ready to use.

  3. Configure the gcloud command-line tool to use your selected project.

    gcloud config set project [selected-project-id]

    where [selected-project-id] is your project ID. (Omit the enclosing brackets.)

Verify the Google Cloud SDK components

To verify that the Google Cloud SDK components are installed:

  1. List your models:

    gcloud ai-platform models list
  2. If you have not created any models before, the command returns an empty list:

    Listed 0 items.

    After you start creating models, you can see them listed by using this command.

  3. If you have installed gcloud previously, update gcloud:

    gcloud components update

Install TensorFlow

To install TensorFlow, run the following command:

pip install --user --upgrade tensorflow

Verify the installation:

python -c "import tensorflow as tf; print('TensorFlow version {} is installed.'.format(tf.VERSION))"

You can ignore any warnings that the TensorFlow library wasn't compiled to use certain instructions.

For more information about installing TensorFlow, including other installation options and troubleshooting information, see the TensorFlow documentation.

Set up your Cloud Storage bucket

This section shows you how to create a new bucket. You can use an existing bucket, but if it is not part of the project you are using to run AI Platform, you must explicitly grant access to the AI Platform service accounts.

  1. Specify a name for your new bucket. The name must be unique across all buckets in Cloud Storage.


    For example, use your project name with -mlengine appended:

    PROJECT_ID=$(gcloud config list project --format "value(core.project)")
    BUCKET_NAME=${PROJECT_ID}-mlengine
  2. Check the bucket name that you created.

    echo $BUCKET_NAME
  3. Select a region for your bucket and set a REGION environment variable.

    For example, the following code creates REGION and sets it to us-central1:

    REGION=us-central1
  4. Create the new bucket:

    gsutil mb -l $REGION gs://$BUCKET_NAME

    Note: Use the same region where you plan on running AI Platform jobs. The example uses us-central1 because that is the region used in the getting-started instructions.
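Bucket names must be globally unique, so `gsutil mb` may fail even for a well-formed name. As a quick local sanity check, the hypothetical helper below (not part of the sample) tests a simplified subset of the Cloud Storage naming rules: lowercase letters, digits, dots, dashes, and underscores; 3-63 characters; starting and ending with a letter or digit. The full rules add more restrictions (for example, on dotted and IP-address-like names), so the authoritative check remains whether bucket creation succeeds.

```python
import re

def looks_like_valid_bucket_name(name: str) -> bool:
    """Simplified check of Cloud Storage bucket naming rules.

    Allows lowercase letters, digits, '.', '-' and '_', 3-63 characters,
    and requires the name to start and end with a letter or digit.
    """
    return re.fullmatch(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]", name) is not None

print(looks_like_valid_bucket_name("my-project-mlengine"))  # True
print(looks_like_valid_bucket_name("My_Bucket!"))           # False
```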

Declaring variables

Start by declaring all variables and making them read-only.

 declare -r BUCKET_NAME="${your_bucket_name}"
 declare -r REGION="your_valid_region"
 declare -r PROJECT_ID=$(gcloud config list project --format "value(core.project)")
 declare -r JOB_NAME="flowers_${USER}_$(date +%Y%m%d_%H%M%S)"
 declare -r GCS_PATH="gs://${BUCKET_NAME}/${USER}/${JOB_NAME}"
 declare -r DICT_FILE=gs://cloud-ml-data/img/flower_photos/dict.txt

 declare -r MODEL_NAME=flowers
 declare -r VERSION_NAME=v1

 echo "Using job id: " $JOB_NAME
 set -v -e

Note that the BUCKET_NAME and REGION variables are user specific, so you must explicitly fill in the name of your own project's bucket and region. For help with choosing a region, see the guide to available regions for AI Platform services.

Preprocessing training and evaluation data in the cloud

For the sake of this tutorial the original dataset of labeled flower images has been randomly split into training and evaluation datasets. Of the original data, 90% is reserved for training and 10% is reserved for evaluation. You start with these two separate files which are stored in a Google-owned Cloud Storage bucket. You then preprocess these two Google-hosted datasets to extract the image features from the bottleneck layer (typically the penultimate layer just before the final output layer that actually does the classification) of the Inception network. You save the output of this preprocessing in your own Cloud Storage bucket and use these files for training.
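The transfer-learning step can be sketched in a few lines: the feature extractor stays frozen, and only a small softmax classifier is trained on the extracted vectors. The sketch below is illustrative only and is not the sample's actual trainer; random, well-separated clusters stand in for the 2048-dimensional Inception bottleneck features that the real pipeline writes to Cloud Storage, and the toy sizes are chosen so it runs in seconds.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, dim, per_class = 5, 32, 40  # toy sizes, not Inception's 2048 dims

# Synthetic "bottleneck" features: one well-separated Gaussian cluster per class.
centers = rng.normal(scale=5.0, size=(num_classes, dim))
X = np.vstack([centers[c] + rng.normal(size=(per_class, dim))
               for c in range(num_classes)])
y = np.repeat(np.arange(num_classes), per_class)

# One-layer softmax classifier trained with plain gradient descent.
W = np.zeros((dim, num_classes))
b = np.zeros(num_classes)
onehot = np.eye(num_classes)[y]
for _ in range(200):
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    grad = (probs - onehot) / len(X)             # gradient of cross-entropy
    W -= 0.1 * (X.T @ grad)
    b -= 0.1 * grad.sum(axis=0)

accuracy = ((X @ W + b).argmax(axis=1) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Because the heavy feature extraction is done once up front, the classifier itself stays small and cheap to train, which is why the small Flowers dataset is enough.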

Preprocessing takes about 60 minutes per dataset to complete.

Evaluation data preprocessing

Start with the evaluation data preprocessing.

python trainer/preprocess.py \
    --input_dict "$DICT_FILE" \
    --input_path "gs://cloud-ml-data/img/flower_photos/eval_set.csv" \
    --output_path "${GCS_PATH}/preproc/eval" \
    --cloud

Monitor preprocessing

To monitor preprocessing progress, you can use either the command-line tool, which prints text log entries, or Dataflow's web-based monitoring UI in the console, which provides a graphical representation of each pipeline along with details about your job's status and execution.

Print log entries with the following commands:

gcloud dataflow jobs list
export JOB_ID="{corresponding_id}"
gcloud beta dataflow logs list $JOB_ID --importance=detailed

Or navigate to Dataflow's monitoring user interface:

Cloud Dataflow Monitoring UI

Training data preprocessing

Preprocess training data in the same manner as the evaluation data.

python trainer/preprocess.py \
    --input_dict "$DICT_FILE" \
    --input_path "gs://cloud-ml-data/img/flower_photos/train_set.csv" \
    --output_path "${GCS_PATH}/preproc/train" \
    --cloud

Monitor progress in the same way as for the evaluation data.

Run model training in the cloud

Once preprocessing on Cloud Dataflow has fully completed, you are ready to train a simple classifier. Verify that preprocessing has finished by navigating to the Dataflow console page and checking the job status.

Cloud Dataflow Monitoring UI

Run model training in the cloud with the following command:

gcloud ai-platform jobs submit training "$JOB_NAME" \
    --stream-logs \
    --module-name trainer.task \
    --package-path trainer \
    --staging-bucket "gs://$BUCKET_NAME" \
    --region "$REGION" \
    --runtime-version=1.10 \
    -- \
    --output_path "${GCS_PATH}/training" \
    --eval_data_paths "${GCS_PATH}/preproc/eval*" \
    --train_data_paths "${GCS_PATH}/preproc/train*"

Monitor training progress

Similar to monitoring preprocessing, you can monitor training progress by streaming text logs from the command line or by launching the TensorBoard visualization tool and pointing it at the summary logs produced during training.

Stream logs with the command line:

gcloud ai-platform jobs stream-logs "$JOB_NAME"

Or open the TensorBoard visualization tool:

macOS

  1. Launch TensorBoard:

    OUTPUT_PATH="${GCS_PATH}/training"
    tensorboard --logdir=$OUTPUT_PATH
  2. Once you start running TensorBoard, you can access it in your browser at http://localhost:6006

Cloud Shell

  1. Launch TensorBoard:

    OUTPUT_PATH="${GCS_PATH}/training"
    tensorboard --logdir=$OUTPUT_PATH --port=8080
  2. Select "Preview on port 8080" from the Web Preview menu at the top of the command line.

You can shut down TensorBoard at any time by pressing Ctrl+C on the command line.

Deploying and using the model for prediction

You now have a SavedModel stored in Cloud Storage, ready to be deployed. Think of the "model" created with the following command as a container for the optimal computation artifacts (TensorFlow graphs) that were produced during training.

  1. Use the gcloud ai-platform models create command to create a named model resource in AI Platform.

    gcloud ai-platform models create "$MODEL_NAME" \
      --regions "$REGION"
  2. Create the first version of the model. Creating a version actually deploys your uniquely defined model to a Cloud instance, and gets it ready to serve (predict).

    gcloud ai-platform versions create "$VERSION_NAME" \
      --model "$MODEL_NAME" \
      --origin "${GCS_PATH}/training/model"

AI Platform automatically sets the first version of your model as the default.

Send a prediction request to the trained model

After you deploy your model, you can test its online prediction capability by downloading an example flower (daisy) image from the eval set and sending it in a prediction request.

  1. Copy the image to your local disk.

      gsutil cp \
        gs://cloud-ml-data/img/flower_photos/daisy/100080576_f52e8ee070_n.jpg \
        daisy.jpg

    To make a prediction request, the downloaded JPEG file must be base64-encoded and embedded in a JSON string.

  2. Create a request message locally in JSON format from the downloaded daisy JPEG file.

      python -c 'import base64, sys, json; img = base64.b64encode(open(sys.argv[1], "rb").read()).decode(); print(json.dumps({"key": "0", "image_bytes": {"b64": img}}))' daisy.jpg > request.json

    This command produces the JSON payload format that AI Platform online prediction expects.

    If you are deploying a new version of your model you might have to wait for your request to be fulfilled. This could take up to 10 minutes. To check the status of your deployed version navigate to AI Platform > Models on Google Cloud Platform Console.

    After a few minutes, the model becomes available via AI Platform's prediction API. When it does, use the predict command to send a prediction request to AI Platform for the given instance.

  3. Call the prediction service API to get classifications.

      gcloud ai-platform predict --model ${MODEL_NAME} --json-instances request.json

    The response should look something like this, where prediction number corresponds to flower type (daisy - 0, dandelion - 1, roses - 2, sunflowers - 3, tulips - 4):

KEY  PREDICTION  SCORES
0    0           [0.9980067610740662, 0.00011650333908619359, 0.00028453863342292607, 0.0006193328299559653, 0.0009433324448764324,

After running this code, the model should classify the image as a daisy (class 0) with a confidence score of approximately 99.8%.
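The scores array holds one value per class, in the label order defined by dict.txt and listed above (daisy 0, dandelion 1, roses 2, sunflowers 3, tulips 4). A minimal sketch of mapping a response's scores back to a flower label, using the score values from the sample response:

```python
# Map a prediction response's class scores back to a flower label.
# Label order follows dict.txt as listed above; the score values are
# copied from the sample response in this tutorial.
labels = ["daisy", "dandelion", "roses", "sunflowers", "tulips"]
scores = [0.9980067610740662, 0.00011650333908619359,
          0.00028453863342292607, 0.0006193328299559653,
          0.0009433324448764324]

best = max(range(len(labels)), key=scores.__getitem__)
print(f"{labels[best]}: {scores[best]:.1%}")  # daisy: 99.8%
```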

Deleting the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the GCP Console, go to the Projects page.

    Go to the Projects page

  2. In the project list, select the project you want to delete and click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Cleaning up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this tutorial:

  1. Open a terminal window (if not already open).

  2. Use the gsutil rm command with the -r flag to delete the directory that contains your most recent job:

    gsutil rm -r gs://$BUCKET_NAME/$JOB_NAME

If successful, the command returns a message similar to:

Removing gs://my-awesome-bucket/just-a-folder/cloud-storage.logo.png#1456530077282000...
Removing gs://my-awesome-bucket/...

Repeat the command for any other directories that you created for this sample.

Alternately, if you have no other data stored in the bucket, you can run the gsutil rm -r command on the bucket itself.

What's next

You've now completed a walkthrough of an AI Platform code sample that uses flower image data for preprocessing, training, deployment and prediction. You ran code to preprocess the image data using Cloud Dataflow, performed training in the cloud, then deployed a model and used it to get an online prediction.

The AI Platform documentation and samples can help you continue learning.
