Hello image data: Setting up your project and environment

Set up your Google Cloud project to use AI Platform (Unified). Then create a Cloud Storage bucket and copy image files to use for training an AutoML image classification model.

This tutorial has several pages:

  1. Setting up your project and environment.

  2. Creating an image classification dataset and importing images.

  3. Training an AutoML image classification model.

  4. Deploying a model to an endpoint and sending a prediction.

  5. Cleaning up your project.

Each page assumes that you have already completed the instructions on the previous pages of the tutorial.

Before you begin

Throughout this tutorial, use Google Cloud Console to interact with Google Cloud. Complete the following steps before using AI Platform (Unified) functionality.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the AI Platform (Unified) API.

    Enable the API

  5. Create a service account:

    1. In the Cloud Console, go to the Create service account page.

      Go to Create service account
    2. Select a project.
    3. In the Service account name field, enter a name. The Cloud Console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for quickstart.

    4. Click Create.
    5. Click the Select a role field.

      Under Quick access, click Basic, then click Owner.

    6. Click Continue.
    7. Click Done to finish creating the service account.

      Do not close your browser window. You will use it in the next step.

  6. Create a service account key:

    1. In the Cloud Console, click the email address for the service account that you created.
    2. Click Keys.
    3. Click Add key, then click Create new key.
    4. Click Create. A JSON key file is downloaded to your computer.
    5. Click Close.
  7. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable applies only to your current shell session, so if you open a new session, set the variable again.

  8. In the Cloud Console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Cloud Console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Cloud SDK already installed, including the gcloud command-line tool, and with values already set for your current project. It can take a few seconds for the session to initialize.
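
For step 7 above, a minimal sketch for a Linux or macOS shell might look like the following. The key file path and the KEY_FILE helper variable are hypothetical; substitute the location where your browser saved the JSON key. The readability check is an optional addition, not part of the original instructions.

```shell
# Hypothetical location of the downloaded key; replace with your actual path.
KEY_FILE="${KEY_FILE:-${HOME:-.}/Downloads/service-account-key.json}"

# Applies to the current shell session only; set it again in new sessions.
export GOOGLE_APPLICATION_CREDENTIALS="${KEY_FILE}"

# Optional sanity check that the path points at a readable file.
if [ -r "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then
  echo "Using credentials file: ${GOOGLE_APPLICATION_CREDENTIALS}"
else
  echo "No readable file at: ${GOOGLE_APPLICATION_CREDENTIALS}" >&2
fi
```

Set the variable in the shell where the key file actually exists; a key downloaded to your computer is not automatically available in a Cloud Shell session.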

1. Create a Cloud Storage bucket

Create a Cloud Storage bucket for the rest of this tutorial. The bucket must have the following specifications:

  1. Is a regional bucket (not multi-region)
  2. Is located in the us-central1 region

As you follow the tutorial, use the bucket to hold image data to import into an image dataset and use for an AutoML image classification training job.

To create the Cloud Storage bucket, run the following commands in your Cloud Shell session:

  1. Open Cloud Shell.

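  2. Set the PROJECT_ID variable. In the following command, replace the PROJECT_ID on the right-hand side with your Google Cloud project ID.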

    export PROJECT_ID=PROJECT_ID
  3. Create a Google Cloud Storage bucket.

    The following command creates a storage bucket in the us-central1 region and uses your project ID as the bucket name.

    gsutil mb -p ${PROJECT_ID} -l us-central1 gs://${PROJECT_ID}/
  4. Set the BUCKET variable.

    export BUCKET=${PROJECT_ID}
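
To confirm that the new bucket meets the location requirement, one option is to inspect its metadata with `gsutil ls -L -b`. The following is a sketch, not part of the original steps: the check_bucket_location helper is hypothetical, and it assumes the metadata output contains a `Location constraint:` line naming the bucket's location.

```shell
# Succeeds if the `gsutil ls -L -b` output on stdin reports the expected
# location (compared case-insensitively). Hypothetical helper.
check_bucket_location() {
  expected="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
  tr '[:lower:]' '[:upper:]' | grep 'LOCATION CONSTRAINT:' | grep -q "${expected}"
}

# Run the check only where gsutil is installed and BUCKET is set.
if command -v gsutil >/dev/null 2>&1 && [ -n "${BUCKET:-}" ]; then
  if gsutil ls -L -b "gs://${BUCKET}/" | check_bucket_location us-central1; then
    echo "Bucket ${BUCKET} is in us-central1"
  else
    echo "Bucket ${BUCKET} is missing or not in us-central1" >&2
  fi
fi
```

Because us-central1 is a single region, a matching location also implies that the bucket is regional rather than multi-region.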

2. Copy sample images into your bucket

Next, copy the flower dataset used in this TensorFlow blog post. The images are stored in a public Cloud Storage bucket, so you can copy the image files directly from the public bucket to your own bucket.

  1. In your Cloud Shell session, enter the following command to copy the images to your Cloud Storage bucket:

    gsutil -m cp -R gs://cloud-samples-data/ai-platform/flowers/ gs://${BUCKET}/img/

    Copying the files takes about 20 minutes.

    This command also copies the all_data.csv file, which lists the original filenames and their labels.
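
Once the copy finishes, you may want to check that each label directory received images. The following sketch counts objects per label from a recursive listing. The count_by_label helper is hypothetical, and it assumes the images end in .jpg and sit under paths of the form .../flowers/LABEL/FILE.jpg, as in the sample rows later on this page.

```shell
# Reads object URIs (one per line) on stdin and prints "LABEL COUNT" for
# each label directory. Non-.jpg objects such as all_data.csv are skipped.
count_by_label() {
  grep '\.jpg$' \
    | awk -F/ '{ count[$(NF-1)]++ } END { for (label in count) print label, count[label] }' \
    | sort
}

# Run the listing only where gsutil is installed and BUCKET is set.
if command -v gsutil >/dev/null 2>&1 && [ -n "${BUCKET:-}" ]; then
  gsutil ls "gs://${BUCKET}/img/flowers/**" | count_by_label
fi
```

A label with a missing or unexpectedly small count suggests that the copy was interrupted and is worth rerunning.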

3. Create a CSV file with image locations and labels

The CSV file you just copied to your bucket lists each image's original location in a public bucket and the image's label. Modify this CSV file to create a new CSV file that lists the image locations in your own bucket and their labels:

  1. Update the CSV file to point to the files in your own bucket:

    gsutil cat gs://${BUCKET}/img/flowers/all_data.csv | sed "s:cloud-ml-data/img/flower_photos/:${BUCKET}/img/flowers/:" > all_data.csv
  2. Copy the CSV file into your bucket:

    gsutil cp all_data.csv gs://${BUCKET}/csv/
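
To see what the sed substitution in step 1 does, you can apply it to a single row. In this sketch the input row is an assumption inferred from the prefix that the sed expression replaces, and the my-bucket default is a hypothetical placeholder used only when BUCKET is not already set.

```shell
# Hypothetical placeholder; in the tutorial, BUCKET is your project ID.
BUCKET="${BUCKET:-my-bucket}"

# Assumed shape of one row in the original all_data.csv.
sample='gs://cloud-ml-data/img/flower_photos/daisy/10559679065_50d2b16f6d.jpg,daisy'

# Same substitution as in step 1, with ':' as the sed delimiter so the
# slashes in the paths need no escaping.
rewritten="$(printf '%s\n' "${sample}" | sed "s:cloud-ml-data/img/flower_photos/:${BUCKET}/img/flowers/:")"
echo "${rewritten}"
```

Only the bucket and directory prefix change; the label directory, filename, and label column pass through untouched.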

CSV format

The CSV you now have in your Cloud Storage bucket has two columns: the first column lists an image's URI in Cloud Storage, and the second column contains the image's label. Below you can see some sample rows:

gs://${BUCKET}/csv/all_data.csv:

gs://${BUCKET}/img/flowers/daisy/10559679065_50d2b16f6d.jpg,daisy
gs://${BUCKET}/img/flowers/dandelion/10828951106_c3cd47983f.jpg,dandelion
gs://${BUCKET}/img/flowers/roses/14312910041_b747240d56_n.jpg,roses
gs://${BUCKET}/img/flowers/sunflowers/127192624_afa3d9cb84.jpg,sunflowers
gs://${BUCKET}/img/flowers/tulips/13979098645_50b9eebc02_n.jpg,tulips
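
Before importing, a quick format check of the CSV can catch rows that were not rewritten correctly. The following sketch uses a hypothetical validate_csv helper that flags any row that does not have exactly two comma-separated columns, a first column starting with gs://, and a non-empty label. It assumes that neither URIs nor labels contain commas.

```shell
# Reads CSV rows on stdin, prints each malformed row with its line number,
# and returns a non-zero status if any row is malformed.
validate_csv() {
  awk -F, '
    BEGIN { bad = 0 }
    NF != 2 || $1 !~ /^gs:\/\// || $2 == "" {
      printf "line %d: %s\n", NR, $0
      bad = 1
    }
    END { exit bad }
  '
}

# Run the check only where gsutil is installed and BUCKET is set.
if command -v gsutil >/dev/null 2>&1 && [ -n "${BUCKET:-}" ]; then
  if gsutil cat "gs://${BUCKET}/csv/all_data.csv" | validate_csv; then
    echo "All rows look well formed"
  fi
fi
```

This checks only the shape of each row; it does not confirm that the referenced image objects exist in your bucket.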

What's next

Follow the next page of this tutorial to use the AI Platform (Unified) UI to create an image classification dataset and import the images you copied to your Cloud Storage bucket.