Hello text data: Set up your project and environment

If you plan to use the Vertex AI SDK for Python, make sure that the service account initializing the client has the Vertex AI Service Agent (roles/aiplatform.serviceAgent) IAM role.

For this part of the tutorial, you set up your Google Cloud project to use Vertex AI and a Cloud Storage bucket that contains the documents for training your AutoML model.

This tutorial has several pages:

  1. Setting up your project and environment.

  2. Creating a text classification dataset.

  3. Training an AutoML text classification model.

  4. Deploying a model to an endpoint and sending a prediction.

  5. Cleaning up your project.

Each page assumes that you have already performed the instructions from the previous pages of the tutorial.

Set up your project and environment

Complete the following steps before using Vertex AI functionality.

  1. In the Google Cloud console, go to the project selector page.

  2. Select or create a Google Cloud project.

  3. Make sure that billing is enabled for your Google Cloud project.

  4. Open Cloud Shell. Cloud Shell is an interactive shell environment for Google Cloud that lets you manage your projects and resources from your web browser.

  5. In Cloud Shell, set the current project to your Google Cloud project ID and store it in the projectid shell variable:

      gcloud config set project PROJECT_ID &&
      projectid=PROJECT_ID &&
      echo $projectid

    Replace PROJECT_ID with your project ID. You can locate your project ID in the Google Cloud console. For more information, see Find your project ID.

  6. Enable the IAM, Compute Engine, Notebooks, Cloud Storage, and Vertex AI APIs:

    gcloud services enable iam.googleapis.com compute.googleapis.com notebooks.googleapis.com storage.googleapis.com aiplatform.googleapis.com

  7. Grant roles to your user account. Run the following command once for each of the following IAM roles: roles/aiplatform.user, roles/storage.admin

    gcloud projects add-iam-policy-binding PROJECT_ID --member="user:USER_IDENTIFIER" --role=ROLE

    • Replace PROJECT_ID with your project ID.

    • Replace USER_IDENTIFIER with the identifier for your user account. For example, myemail@example.com.

    • Replace ROLE with each individual role.

    The Vertex AI User (roles/aiplatform.user) IAM role provides access to use all resources in Vertex AI. The Storage Admin (roles/storage.admin) IAM role lets you store the training documents in Cloud Storage.
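
Because the role-granting command above has to be run once per role, a small loop can save some typing. The following is a minimal sketch, not part of the official tutorial: the function name grant_tutorial_roles and the GCLOUD override variable are hypothetical helpers introduced here for illustration. The gcloud subcommand and its flags are the ones shown in the step above.

```shell
# Grant both tutorial roles to a single user account.
# GCLOUD is a hypothetical override (not a gcloud feature): set GCLOUD=echo to
# preview the commands. By default the real gcloud CLI is called.
grant_tutorial_roles() {
  project_id=$1   # your project ID
  user_email=$2   # your account email, without the "user:" prefix

  for role in roles/aiplatform.user roles/storage.admin; do
    "${GCLOUD:-gcloud}" projects add-iam-policy-binding "$project_id" \
      --member="user:${user_email}" \
      --role="$role" || return 1   # stop at the first failed binding
  done
}
```

After defining the function, running GCLOUD=echo grant_tutorial_roles PROJECT_ID myemail@example.com should print the arguments of the two commands without changing any IAM policy. Dropping the GCLOUD=echo prefix runs them for real.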

Create a Cloud Storage bucket and copy the sample dataset

Create a Cloud Storage bucket to store the documents that you use to train your AutoML model.

  1. Open Cloud Shell.

  2. Set the PROJECT_ID variable to the ID of your project. In the following command, replace the PROJECT_ID on the right-hand side with your project ID.

    export PROJECT_ID=PROJECT_ID
  3. Set the BUCKET variable, which you use to create a Cloud Storage bucket.

    export BUCKET=${PROJECT_ID}-lcm
  4. Create a Cloud Storage bucket in the us-central1 region, using the value of the BUCKET variable as the bucket name.

    gcloud storage buckets create gs://${BUCKET}/ --project=${PROJECT_ID} --location=us-central1
  5. Copy the happiness.csv sample training dataset into your bucket.

    gcloud storage cp gs://cloud-ml-data/NL-classification/happiness.csv gs://${BUCKET}/text/ --recursive
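
The bucket steps above can also be collected into one function that stops at the first failure and then lists the destination folder to confirm that the file arrived. This is a rough sketch under stated assumptions: the function name setup_tutorial_bucket and the GCLOUD override variable are hypothetical and introduced only for illustration, and the final gcloud storage ls check is an addition that the steps above do not include.

```shell
# Create the tutorial bucket, copy the sample CSV, and list the result.
# GCLOUD is a hypothetical override (not a gcloud feature): set GCLOUD=echo to
# preview the commands. By default the real gcloud CLI is called.
setup_tutorial_bucket() {
  project_id=$1
  bucket="${project_id}-lcm"      # same naming scheme as the BUCKET variable
  gcloud_cmd=${GCLOUD:-gcloud}

  # Create the bucket in us-central1.
  "$gcloud_cmd" storage buckets create "gs://${bucket}/" \
    --project="$project_id" --location=us-central1 || return 1

  # Copy the sample training dataset into the text/ folder.
  "$gcloud_cmd" storage cp \
    gs://cloud-ml-data/NL-classification/happiness.csv \
    "gs://${bucket}/text/" --recursive || return 1

  # Added check (not in the original steps): list the destination folder.
  "$gcloud_cmd" storage ls "gs://${bucket}/text/" || return 1
}
```

Running GCLOUD=echo setup_tutorial_bucket PROJECT_ID previews the three commands. Without the override, the last command should list happiness.csv under gs://PROJECT_ID-lcm/text/ if the copy succeeded.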

What's next

Follow the next page of this tutorial to use the Vertex AI console to create a text classification dataset and import the documents you copied to your Cloud Storage bucket.