Hello image data: Creating an image classification dataset and importing images

Use the Google Cloud Console to create an image classification dataset. After your dataset is created, use a CSV pointing to images in a public Cloud Storage bucket to import those images into the dataset.

This tutorial has several pages:

  1. Setting up your project and environment.

  2. Creating an image classification dataset and importing images.

  3. Training an AutoML image classification model.

  4. Deploying a model to an endpoint and sending a prediction.

  5. Cleaning up your project.

Each page assumes that you have already performed the instructions from the previous pages of the tutorial.

Image data input file

The image files you use in this tutorial are from the flower dataset used in this TensorFlow blog post. These input images are stored in a public Cloud Storage bucket. This publicly accessible bucket also contains the CSV file that you use for data import. The file has two columns: the first column lists an image's URI in Cloud Storage, and the second column contains the image's label. Below you can see some sample rows:

gs://cloud-samples-data/ai-platform/flowers/flowers.csv:

gs://cloud-samples-data/ai-platform/flowers/daisy/10559679065_50d2b16f6d.jpg,daisy
gs://cloud-samples-data/ai-platform/flowers/dandelion/10828951106_c3cd47983f.jpg,dandelion
gs://cloud-samples-data/ai-platform/flowers/roses/14312910041_b747240d56_n.jpg,roses
gs://cloud-samples-data/ai-platform/flowers/sunflowers/127192624_afa3d9cb84.jpg,sunflowers
gs://cloud-samples-data/ai-platform/flowers/tulips/13979098645_50b9eebc02_n.jpg,tulips
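
To make the two-column layout concrete, the following is a minimal sketch that parses the sample rows above with Python's standard library and counts the images per label. The function name parse_import_rows and its checks are illustrative assumptions, not part of Vertex AI; the checks mirror only the layout described for this tutorial's file and are not a general validator for import files.

```python
import csv
import io
from collections import Counter

# Sample rows copied from the flowers.csv listing in this tutorial.
SAMPLE_CSV = """\
gs://cloud-samples-data/ai-platform/flowers/daisy/10559679065_50d2b16f6d.jpg,daisy
gs://cloud-samples-data/ai-platform/flowers/dandelion/10828951106_c3cd47983f.jpg,dandelion
gs://cloud-samples-data/ai-platform/flowers/roses/14312910041_b747240d56_n.jpg,roses
gs://cloud-samples-data/ai-platform/flowers/sunflowers/127192624_afa3d9cb84.jpg,sunflowers
gs://cloud-samples-data/ai-platform/flowers/tulips/13979098645_50b9eebc02_n.jpg,tulips
"""


def parse_import_rows(text):
    """Return (image_uri, label) pairs; raise ValueError on a malformed row.

    Hypothetical helper: it checks only the two-column layout used by
    this tutorial's CSV file.
    """
    rows = []
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(row)}")
        uri, label = row[0].strip(), row[1].strip()
        if not uri.startswith("gs://"):
            raise ValueError(f"line {line_no}: not a Cloud Storage URI: {uri!r}")
        if not label:
            raise ValueError(f"line {line_no}: empty label")
        rows.append((uri, label))
    return rows


rows = parse_import_rows(SAMPLE_CSV)
label_counts = Counter(label for _, label in rows)
for label, count in sorted(label_counts.items()):
    print(label, count)
```

Running a check like this on a local copy of your own CSV file before import can surface missing labels or malformed URIs early.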

1. Create an image classification dataset and import data

Visit the Cloud Console to begin the process of creating your dataset and training your image classification model.

When prompted, make sure to select the project that you used for your Cloud Storage bucket.

  1. From the Get started with Vertex AI page, click Create dataset.

    Vertex AI dashboard

  2. Specify a name for this dataset (optional).

  3. In the Image tab of the "Select an objective" section, choose the Image classification (Single-label) radio option. In the Region drop-down menu, select US Central.

    New dataset window

  4. Click Create to create the empty dataset. You then advance to the data import window.

  5. Select the Select import files from Cloud Storage option and specify the Cloud Storage URI of the CSV file that contains the image locations and labels. For this tutorial, the CSV file is at gs://cloud-samples-data/ai-platform/flowers/flowers.csv. Copy and paste the following into the "Import file path" field:

    • cloud-samples-data/ai-platform/flowers/flowers.csv

    Select file import window

  6. Click Continue to begin image import. The import process takes a few minutes. When the import completes, the next page shows all of the images identified for your dataset, both labeled and unlabeled.
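
The console steps above can also be approximated in code. The following is a rough sketch, assuming the google-cloud-aiplatform Python package is installed and application default credentials are configured; this tutorial itself uses only the console. The helper names dataset_args and create_flowers_dataset, the display name "flowers", and the mapping of the console's US Central region to the location "us-central1" are assumptions made for illustration.

```python
# Sketch: a programmatic counterpart to the console steps above.
# Assumes the google-cloud-aiplatform package and valid credentials.

CSV_URI = "gs://cloud-samples-data/ai-platform/flowers/flowers.csv"


def dataset_args(display_name, csv_uri=CSV_URI):
    """Collect the dataset name and import file as keyword arguments.

    Hypothetical helper: it only checks that the import file looks like
    a Cloud Storage URI that points to a CSV file.
    """
    if not csv_uri.startswith("gs://") or not csv_uri.endswith(".csv"):
        raise ValueError(f"expected a gs:// URI to a CSV file, got {csv_uri!r}")
    return {"display_name": display_name, "gcs_source": csv_uri}


def create_flowers_dataset(project_id, display_name="flowers"):
    """Create a single-label image dataset and import the CSV file."""
    # Imported lazily so that this file loads without the SDK installed.
    from google.cloud import aiplatform

    # "us-central1" is assumed to match the console's US Central region.
    aiplatform.init(project=project_id, location="us-central1")
    schema = aiplatform.schema.dataset.ioformat.image.single_label_classification
    return aiplatform.ImageDataset.create(
        import_schema_uri=schema,
        **dataset_args(display_name),
    )


if __name__ == "__main__":
    # Prints the arguments only; call create_flowers_dataset("your-project-id")
    # to create the dataset in a real project.
    print(dataset_args("flowers"))
```

As with the console flow, the import is expected to take a few minutes, and the call creates billable resources in the project you pass in.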

What's next

Follow the next page of this tutorial to start an AutoML model training job.