Hello text data: Setting up your project and environment

This tutorial demonstrates how to create a model for classifying content using Vertex AI. The tutorial trains an AutoML model by using a corpus of crowd-sourced "happy moments" from the Kaggle open-source dataset HappyDB. The resulting model classifies happy moments into categories reflecting the causes of happiness.

For this part of the tutorial, you will set up your Google Cloud project to use Vertex AI and a Cloud Storage bucket that will contain the documents for training your AutoML model.

This tutorial has several pages:

  1. Setting up your project and environment.

  2. Creating a text classification dataset.

  3. Training an AutoML text classification model.

  4. Deploying a model to an endpoint and sending a prediction.

  5. Cleaning up your project.

Each page assumes that you have already performed the instructions from the previous pages of the tutorial.

Setting up your project

Throughout this tutorial, use the Google Cloud Console to interact with Google Cloud. Complete the following steps before using Vertex AI.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to confirm that billing is enabled for your project.

  4. Enable the Vertex AI API.

    Enable the API

  5. Create a service account:

    1. In the Cloud Console, go to the Create service account page.

      Go to Create service account
    2. Select a project.
    3. In the Service account name field, enter a name. The Cloud Console fills in the Service account ID field based on this name.

      In the Service account description field, enter a description. For example, Service account for quickstart.

    4. Click Create.
    5. Click the Select a role field.

      Under Quick access, click Basic, then click Owner.

    6. Click Continue.
    7. Click Done to finish creating the service account.

      Do not close your browser window. You will use it in the next step.

  6. Create a service account key:

    1. In the Cloud Console, click the email address for the service account that you created.
    2. Click Keys.
    3. Click Add key, then click Create new key.
    4. Click Create. A JSON key file is downloaded to your computer.
    5. Click Close.
  7. Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file that contains your service account key. This variable applies only to your current shell session, so if you open a new session, set the variable again.

  8. In the Cloud Console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Cloud Console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Cloud SDK already installed, including the gcloud command-line tool, and with values already set for your current project. It can take a few seconds for the session to initialize.
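Step 7 above names the GOOGLE_APPLICATION_CREDENTIALS variable but does not show a command for setting it. The following is a minimal sketch for a POSIX-style shell such as the one in Cloud Shell. The key file path is a hypothetical placeholder, not a value from this tutorial; replace it with the location of the JSON key you downloaded in step 6.

```shell
# Hypothetical location of the downloaded JSON key; replace with your own path.
KEY_FILE="${HOME}/keys/my-service-account-key.json"

# Export the variable so that programs started from this shell can read it.
export GOOGLE_APPLICATION_CREDENTIALS="${KEY_FILE}"

# Optional sanity check: warn if the path does not point to an existing file.
if [ -f "${GOOGLE_APPLICATION_CREDENTIALS}" ]; then
  echo "Key file found: ${GOOGLE_APPLICATION_CREDENTIALS}"
else
  echo "Warning: no key file at ${GOOGLE_APPLICATION_CREDENTIALS}" >&2
fi
```

Because the variable lasts only for the current session, you would need to run the export line again in each new shell.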

Creating a Cloud Storage bucket and copying the sample dataset

Create a Cloud Storage bucket to store the documents that you will use to train your AutoML model.

  1. Open Cloud Shell.

  2. Set the PROJECT_ID variable to the ID of your project.

  3. Set the BUCKET variable, which you will use to create a Cloud Storage bucket.

    export BUCKET=${PROJECT_ID}-lcm

  4. Create a Cloud Storage bucket in the us-central1 region, using the name stored in the BUCKET variable.

    gsutil mb -p ${PROJECT_ID} -l us-central1 gs://${BUCKET}/

  5. Copy the happiness.csv sample training dataset into your bucket.

    gsutil -m cp -R gs://cloud-ml-data/NL-classification/happiness.csv gs://${BUCKET}/text/
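Step 2 above does not show a command for setting PROJECT_ID, so the sketch below illustrates one way the variables in steps 2 and 3 might be set and how the resulting paths fit together. The project ID my-project-id and the DATASET_URI variable are hypothetical, introduced only for illustration. The final check runs only if gsutil is installed, and it is expected to succeed only after you have completed steps 4 and 5 with your own project.

```shell
# Hypothetical project ID; replace with the ID of your own project.
export PROJECT_ID="my-project-id"

# Bucket name derived from the project ID, as in step 3.
export BUCKET="${PROJECT_ID}-lcm"

# Where the copy command in step 5 places the file (illustrative variable).
DATASET_URI="gs://${BUCKET}/text/happiness.csv"

echo "Bucket:  gs://${BUCKET}/"
echo "Dataset: ${DATASET_URI}"

# Optional check: list the copied file when gsutil is available.
if command -v gsutil >/dev/null 2>&1; then
  gsutil ls "${DATASET_URI}" || echo "Dataset not found; run steps 4 and 5 first." >&2
fi
```

Deriving the bucket name from the project ID reduces the chance of a naming collision, because Cloud Storage bucket names must be globally unique.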

What's next

Follow the next page of this tutorial to use the Vertex AI console to create a text classification dataset and import the documents you copied to your Cloud Storage bucket.