Create a topic model

Prerequisites

  1. Complete the instructions on the before you begin page.
  2. Make sure that the roles assigned to your service account allow write access to the project that you intend to use for topic modeling (Project > Owner or Project > Editor). The service account must also have read access to the Cloud Storage API.
  3. Make sure that the Data Labeling API is enabled in your project. For details on enabling an API, see the cloud endpoints guide.

Import conversation data

To create a topic model, you need at least 10,000 conversations; requests with fewer conversations are rejected. For information on bulk importing conversation data, see the client tooling documentation.

You can provide your conversation data as either audio data (like phone call recordings) or as JSON-formatted text files. For details on the format and instructions for uploading it to Cloud Storage, see the conversation data reference.

You can upload these example conversation files to Cloud Storage:

Training data best practices

  1. Make sure the transcripts are predominantly in English (a few non-English words or sentences are fine). Topic modeling currently supports English conversations only.

  2. Make sure that each conversation's speaker roles are assigned properly when the conversation is ingested. Each conversation turn should be accurately labeled as coming from either the customer or the agent. For messages from agents, specify whether the message is from a human agent or a bot or system agent. Use AGENT for human agents and AUTOMATED_AGENT for bot agents. You can use either END_USER or CUSTOMER for customer roles.

  3. Make sure most conversations have transcripts from both customer and agent channels. Conversations with only one channel won't be used in training.

  4. We recommend that you check the Cloud Data Loss Prevention redaction quality, if applicable. Sometimes the redaction is overly aggressive and removes important information from the transcripts.

  5. Provide 100,000 or more transcripts for training. The system works with fewer transcripts, but more transcripts lead to better performance.
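
The role and channel checks above can be run locally before you upload transcripts. The following is a minimal sketch, not part of the CCAI Insights tooling. It assumes each conversation has already been loaded as a list of turn dictionaries with a "role" key; that layout and the check_roles helper are hypothetical, so adapt them to the format described in the conversation data reference.

```python
# Role names taken from the best practices above.
CUSTOMER_ROLES = {"END_USER", "CUSTOMER"}
AGENT_ROLES = {"AGENT", "AUTOMATED_AGENT"}


def check_roles(entries):
    """Return a list of problems found in one conversation's turns.

    `entries` is assumed (hypothetically) to be a list of dicts,
    each with a "role" key. An empty result means no problems.
    """
    problems = []
    roles = [entry.get("role") for entry in entries]

    # Flag any role that is not one of the four documented values.
    known = CUSTOMER_ROLES | AGENT_ROLES
    unknown = sorted({str(role) for role in roles if role not in known})
    if unknown:
        problems.append(f"unrecognized roles: {unknown}")

    # Conversations with only one channel are not used in training.
    if not any(role in CUSTOMER_ROLES for role in roles):
        problems.append("no customer turns")
    if not any(role in AGENT_ROLES for role in roles):
        problems.append("no agent turns")
    return problems
```

For example, check_roles([{"role": "AGENT"}]) reports that the conversation has no customer turns, so it would not contribute to training.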

Create a model

To create a new model, you must define your model and send a creation request to the CCAI Insights API. In the definition for your model, you must provide the following information:

  • A display name for your model.
  • A training information configuration for your data. You can specify either CHAT data or PHONE_CALL data, depending on the source of your conversation transcripts. By default, CCAI Insights uses all conversations in your Google Cloud project to create the topic model.

REST

To create a topic model, call the create method on the issueModel resource.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: your GCP project ID.
  • LOCATION_ID: the location you chose for your Cloud Storage bucket. The only location currently available is us-central1.
  • MODEL_NAME: a human-readable name for the new issue model.

HTTP method and URL:

POST https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/issueModels

Request JSON body:

{
  "display_name": "MODEL_NAME",
  "input_data_config": {
    "filter": "medium=\"CHAT\""
  }
}

You should receive a JSON response similar to the following:

{
  "name": "projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID"
}
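
If you assemble the REST request in code, the placeholder substitution can be sketched as follows. This sketch only builds the URL and JSON body shown above using the Python standard library; it does not send the request or handle authentication. The build_create_request helper name is hypothetical.

```python
import json

# Media types named in the model definition above.
ALLOWED_MEDIA = {"CHAT", "PHONE_CALL"}


def build_create_request(project_id, location_id, model_name, medium="CHAT"):
    """Return (url, json_body) for an issue model creation request."""
    if medium not in ALLOWED_MEDIA:
        raise ValueError(f"medium must be one of {sorted(ALLOWED_MEDIA)}")

    url = (
        "https://contactcenterinsights.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location_id}/issueModels"
    )
    body = {
        "display_name": model_name,
        # The filter value carries its own double quotes; json.dumps
        # escapes them in the serialized request body.
        "input_data_config": {"filter": f'medium="{medium}"'},
    }
    return url, json.dumps(body)
```

Serializing with json.dumps rather than hand-writing the string keeps the display name quoted and the inner quotes of the filter escaped.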

Python

from google.cloud import contact_center_insights_v1


def create_issue_model(project_id: str) -> contact_center_insights_v1.IssueModel:
    """Creates an issue model.

    Args:
        project_id:
            The project identifier. For example, 'my-project'.

    Returns:
        An issue model.
    """
    # Construct a parent resource.
    parent = (
        contact_center_insights_v1.ContactCenterInsightsClient.common_location_path(
            project_id, "us-central1"
        )
    )

    # Construct an issue model.
    issue_model = contact_center_insights_v1.IssueModel()
    issue_model.display_name = "my-model"
    issue_model.input_data_config.filter = 'medium="CHAT"'

    # Call the Insights client to create an issue model.
    insights_client = contact_center_insights_v1.ContactCenterInsightsClient()
    issue_model_operation = insights_client.create_issue_model(
        parent=parent, issue_model=issue_model
    )

    issue_model = issue_model_operation.result(timeout=86400)
    print(f"Created an issue model named {issue_model.name}")
    return issue_model

Java


import com.google.cloud.contactcenterinsights.v1.ContactCenterInsightsClient;
import com.google.cloud.contactcenterinsights.v1.IssueModel;
import com.google.cloud.contactcenterinsights.v1.LocationName;
import java.io.IOException;

public class CreateIssueModel {

  public static void main(String[] args) throws Exception, IOException {
    // TODO(developer): Replace this variable before running the sample.
    String projectId = "my_project_id";

    createIssueModel(projectId);
  }

  public static IssueModel createIssueModel(String projectId) throws Exception, IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (ContactCenterInsightsClient client = ContactCenterInsightsClient.create()) {
      // Construct a parent resource.
      LocationName parent = LocationName.of(projectId, "us-central1");

      // Construct an issue model.
      IssueModel issueModel =
          IssueModel.newBuilder()
              .setDisplayName("my-model")
              .setInputDataConfig(
                  IssueModel.InputDataConfig.newBuilder().setFilter("medium=\"CHAT\"").build())
              .build();

      // Call the Insights client to create an issue model.
      IssueModel response = client.createIssueModelAsync(parent, issueModel).get();
      System.out.printf("Created %s%n", response.getName());
      return response;
    }
  }
}

Node.js

/**
 * TODO(developer): Uncomment this variable before running the sample.
 */
// const projectId = 'my_project_id';

// Imports the Contact Center Insights client.
const {
  ContactCenterInsightsClient,
} = require('@google-cloud/contact-center-insights');

// Instantiates a client.
const client = new ContactCenterInsightsClient();

async function createIssueModel() {
  const [operation] = await client.createIssueModel({
    parent: client.locationPath(projectId, 'us-central1'),
    issueModel: {
      displayName: 'my-model',
      inputDataConfig: {
        filter: 'medium="CHAT"',
      },
    },
  });

  // Wait for the operation to complete.
  const [issueModel] = await operation.promise();
  console.info(`Created ${issueModel.name}`);
}
createIssueModel();

Operation status

Creating a topic model is a long-running operation, so it might take a substantial amount of time to complete. You can poll the status of the operation to see if it has completed.
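
A polling loop for the operation might look like the following sketch. It assumes a caller-supplied fetch function (hypothetical, not shown here) that retrieves the operation resource as a dictionary, for example through an authenticated GET on the operation name returned by the create request. The loop relies only on the standard long-running operation fields done, error, and response.

```python
import time


def wait_for_operation(fetch, interval_s=60.0, max_polls=100, sleep=time.sleep):
    """Poll until the operation reports done, then return it.

    `fetch` is a hypothetical zero-argument callable that returns the
    current operation resource as a dict. `sleep` is injectable so the
    loop can be exercised without real delays.
    """
    for _ in range(max_polls):
        operation = fetch()
        # "done" is absent or false while the operation is still running.
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        sleep(interval_s)
    raise TimeoutError(f"operation not done after {max_polls} polls")
```

Because model training can take a long time, a polling interval of a minute or more is usually enough; the values above are illustrative defaults, not recommendations from the API.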