Hello text data: Set up your project and environment

This tutorial demonstrates how to create a model for classifying content using Vertex AI. The tutorial trains an AutoML model by using a corpus of crowd-sourced "happy moments" from the Kaggle open-source dataset HappyDB. The resulting model classifies happy moments into categories reflecting the causes of happiness.

For this part of the tutorial, you set up your Google Cloud project to use Vertex AI and create a Cloud Storage bucket that holds the documents for training your AutoML model.

This tutorial has several pages:

  1. Setting up your project and environment.

  2. Creating a text classification dataset.

  3. Training an AutoML text classification model.

  4. Deploying a model to an endpoint and sending a prediction.

  5. Cleaning up your project.

Each page assumes that you have already performed the instructions from the previous pages of the tutorial.

Set up your project

Throughout this tutorial, use the Google Cloud console to interact with Google Cloud. Complete the following steps before using Vertex AI functionality.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

  4. Enable the Vertex AI API.

    Enable the API

  5. In the Google Cloud console, activate Cloud Shell.

    Activate Cloud Shell

    At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
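
The console steps above can also be reviewed as commands from Cloud Shell. The following is a minimal sketch, assuming the gcloud CLI is available (as it is in Cloud Shell) and that the Vertex AI API is exposed under the service name aiplatform.googleapis.com. The helper name vertex_setup_commands is hypothetical and is not part of any Google tool; it only prints the commands so that you can read them before running anything.

```shell
# Sketch: print the gcloud commands that mirror the console setup steps.
# vertex_setup_commands is a hypothetical helper used for illustration only.
vertex_setup_commands() {
  local project_id="$1"
  if [ -z "${project_id}" ]; then
    echo "usage: vertex_setup_commands PROJECT_ID" >&2
    return 1
  fi
  # Point the CLI at the project that you selected in the console.
  echo "gcloud config set project ${project_id}"
  # Enable the Vertex AI API for that project.
  echo "gcloud services enable aiplatform.googleapis.com --project=${project_id}"
  # List the enabled services and look for the Vertex AI API among them.
  echo "gcloud services list --enabled --project=${project_id} | grep aiplatform.googleapis.com"
}
```

For example, vertex_setup_commands my-project prints three commands for a project whose ID is my-project (a placeholder). Paste them into Cloud Shell one at a time once they look right for your project.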

Create a Cloud Storage bucket and copy the sample dataset

Create a Cloud Storage bucket to store the documents that you will use to train your AutoML model.

  1. Open Cloud Shell.

  2. Set the PROJECT_ID variable to the ID of your project, replacing PROJECT_ID on the right-hand side with your own project ID.

    export PROJECT_ID=PROJECT_ID

  3. Set the BUCKET variable, which you will use to create a Cloud Storage bucket.

    export BUCKET=${PROJECT_ID}-lcm

  4. Create a Cloud Storage bucket in the us-central1 region, named with the BUCKET variable.

    gsutil mb -p ${PROJECT_ID} -l us-central1 gs://${BUCKET}/

  5. Copy the happiness.csv sample training dataset into your bucket.

    gsutil -m cp -R gs://cloud-ml-data/NL-classification/happiness.csv gs://${BUCKET}/text/
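
Because the bucket name is derived from the project ID, it can help to check the derived name before running gsutil mb. The sketch below assumes the basic Cloud Storage naming rules (3 to 63 characters; lowercase letters, digits, dashes, underscores, and dots; starting and ending with a letter or digit). It does not cover every rule, such as reserved words, and it cannot tell whether the name is already taken by another project. The function name is_valid_bucket_name and the my-project default are hypothetical.

```shell
# Sketch: check a derived bucket name against basic Cloud Storage naming rules.
# is_valid_bucket_name is a hypothetical helper; it checks only length and
# allowed characters, not reserved words or global uniqueness.
is_valid_bucket_name() {
  local name="$1"
  local len=${#name}
  # Use the C locale so that a-z matches lowercase ASCII letters only.
  local LC_ALL=C
  # The name must be between 3 and 63 characters long.
  if [ "${len}" -lt 3 ] || [ "${len}" -gt 63 ]; then
    return 1
  fi
  case "${name}" in
    # Reject any character outside lowercase letters, digits, dot, underscore, dash.
    *[!a-z0-9._-]*) return 1 ;;
    # Reject names that do not start and end with a letter or digit.
    [!a-z0-9]*|*[!a-z0-9]) return 1 ;;
  esac
  return 0
}

# Derive the bucket name the same way as the steps above.
PROJECT_ID="${PROJECT_ID:-my-project}"   # placeholder default, an assumption
BUCKET="${PROJECT_ID}-lcm"

if is_valid_bucket_name "${BUCKET}"; then
  echo "OK: gs://${BUCKET}"
else
  echo "Invalid bucket name: ${BUCKET}" >&2
fi
```

A project ID that contains a colon, such as a domain-scoped ID, fails this check and needs a different BUCKET value. After the copy step, running gsutil ls gs://${BUCKET}/text/ should list happiness.csv.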

What's next

Follow the next page of this tutorial to use the Vertex AI console to create a text classification dataset and import the documents you copied to your Cloud Storage bucket.