Use the API to bulk import conversations

Learn how you can import audio and transcripts with their metadata using the API.

Prerequisites

  1. Make sure that the Cloud Storage, Speech-to-Text, Data Loss Prevention, and Insights APIs are enabled on your Google Cloud project.
  2. Make sure that your conversation data is staged in a Cloud Storage bucket. See the Cloud Storage documentation for information about creating a bucket.
    1. The Speech-to-Text and Contact Center AI Insights service agents must have access to the objects in your Cloud Storage bucket. The service agent email addresses use the following formats, respectively:
      service-PROJECT_NUMBER@gcp-sa-speech.iam.gserviceaccount.com
      service-PROJECT_NUMBER@gcp-sa-contactcenterinsights.iam.gserviceaccount.com
  3. If you opt to import conversation metadata, ensure that the metadata files are in their own bucket and that each metadata filename matches its corresponding conversation filename. For example, a conversation with the Cloud Storage URI gs://transcript-bucket-name/conversation.mp3 should have a corresponding metadata file such as gs://metadata-bucket-name/conversation.json.
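
The naming rules above can be checked before starting an import. The following Python sketch is illustrative only: the helper names service_agent_emails and metadata_uri_for are hypothetical, and it assumes that only the base filename (without its extension) must match and that metadata files sit at the root of the metadata bucket.

```python
from pathlib import PurePosixPath


def service_agent_emails(project_number: str) -> dict[str, str]:
    """Return the two service agent addresses in the formats listed above."""
    return {
        "speech": f"service-{project_number}@gcp-sa-speech.iam.gserviceaccount.com",
        "insights": (
            f"service-{project_number}"
            "@gcp-sa-contactcenterinsights.iam.gserviceaccount.com"
        ),
    }


def metadata_uri_for(conversation_uri: str, metadata_bucket: str) -> str:
    """Map a conversation object URI to the metadata URI it should pair with.

    Assumption: the match is on the base filename only, and the metadata
    file is stored at the root of the metadata bucket.
    """
    if not conversation_uri.startswith("gs://"):
        raise ValueError(f"not a Cloud Storage URI: {conversation_uri}")
    object_path = conversation_uri[len("gs://"):]
    stem = PurePosixPath(object_path).stem  # filename without its extension
    return f"gs://{metadata_bucket}/{stem}.json"
```

For the example in step 3, metadata_uri_for("gs://transcript-bucket-name/conversation.mp3", "metadata-bucket-name") yields gs://metadata-bucket-name/conversation.json.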

Import conversation data

Transcripts

Chat transcripts must be supplied as JSON-formatted files that match the CCAI conversation data format.

Voice transcripts can be supplied in the CCAI conversation data format or as the returned speech recognition result of a Speech-to-Text API transcription. The response is identical for synchronous and asynchronous recognition across all Speech-to-Text API versions.

Audio

Contact Center AI Insights uses Cloud Speech-to-Text batch recognition to transcribe audio. By default, transcription is done for the en-US language using the telephony model. Redaction is not done unless redaction configs are explicitly supplied in the project Settings or in the IngestConversationsRequest.

For detailed information on how to optionally configure transcription and redaction for bulk audio import, refer to the instructions for the single audio import API, which apply here.

Metadata

Conversation metadata files must be supplied as JSON-formatted files in the bucket specified in the gcs_source.metadata_bucket_uri field of the IngestConversationsRequest. If a file contains quality metadata, CCAI Insights uses it to populate the conversation's QualityMetadata. Additionally, any custom metadata keys listed in the custom_metadata_keys field of the request are stored in the conversation labels. Up to 100 labels are supported per conversation.

See the following example of a valid metadata file containing both quality and custom metadata.

{
  "customer_satisfaction_rating": 5,
  "agent_info": [
    {
      "agent_id": "123456",
      "display_name": "Agent Name",
      "team": "Agent Team",
      "disposition_code": "resolved"
    }
  ],
  "custom_key": "custom value"
}
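
A metadata file like the one above can be generated programmatically. The following Python sketch is a minimal illustration, not an official client: build_metadata is a hypothetical helper, and it assumes that each custom key becomes one conversation label, so it rejects more than 100 custom keys.

```python
import json

MAX_LABELS = 100  # per-conversation label limit stated above
QUALITY_FIELDS = {"customer_satisfaction_rating", "agent_info"}


def build_metadata(rating: int, agents: list[dict], custom: dict[str, str]) -> str:
    """Serialize quality metadata plus custom keys into one JSON document."""
    # Assumption: every custom key maps to exactly one conversation label.
    if len(custom) > MAX_LABELS:
        raise ValueError(
            f"{len(custom)} custom keys exceed the {MAX_LABELS}-label limit"
        )
    # Custom keys must not overwrite the quality metadata fields.
    clash = QUALITY_FIELDS.intersection(custom)
    if clash:
        raise ValueError(f"custom keys collide with quality fields: {sorted(clash)}")
    document = {
        "customer_satisfaction_rating": rating,
        "agent_info": agents,
        **custom,
    }
    return json.dumps(document, indent=2)
```

The returned string can be written to a file named after the conversation and uploaded to the metadata bucket.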

Sample Commands

REST

Refer to the conversations:ingest API endpoint for complete details.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: your Google Cloud Platform project ID.
  • GCS_BUCKET_URI: the Cloud Storage URI that points to the bucket containing the conversation audio or transcripts. May contain a prefix, for example gs://BUCKET_NAME or gs://BUCKET_NAME/PREFIX. Wildcards are not supported.
  • MEDIUM: set to either PHONE_CALL or CHAT, depending on the data type. If unspecified, the default value is PHONE_CALL.
  • AGENT_ID: Optional. Agent ID for the entire bucket.

HTTP method and URL:

POST https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/conversations:ingest

Request JSON body:

{
  "gcsSource":  {
    "bucketUri": "GCS_BUCKET_URI",
    "bucketObjectType": "AUDIO"
  },
  "transcriptObjectConfig": { "medium": "PHONE_CALL" },
  "conversationConfig": {
    "agentId": "AGENT_ID",
    "agentChannel": "AGENT_CHANNEL",
    "customerChannel": "CUSTOMER_CHANNEL"
  }
}

Or, to import transcripts instead of audio:

{
  "gcsSource":  {
    "bucketUri": "GCS_BUCKET_URI",
    "bucketObjectType": "TRANSCRIPT"
  },
  "transcriptObjectConfig": { "medium": "MEDIUM" },
  "conversationConfig": {"agentId": "AGENT_ID"}
}

You should receive a JSON response similar to the following:


{
  "name": "projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.contactcenterinsights.v1main.IngestConversationsMetadata",
    "createTime": "...",
    "request": {
      "parent": "projects/PROJECT_ID/locations/us-central1",
      "gcsSource": {
        "bucketUri": "GCS_BUCKET_URI",
        "bucketObjectType": "BUCKET_OBJECT_TYPE"
      },
      "transcriptObjectConfig": {
        "medium": "MEDIUM"
      },
      "conversationConfig": {
        "agentId": "AGENT_ID"
      }
    }
  }
}
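
The ingest request shown earlier can also be assembled in code. The following Python sketch uses only the standard library and builds the request without sending it. The helper name build_ingest_request and the ACCESS_TOKEN placeholder are assumptions; one common way to obtain a token is the gcloud auth print-access-token command.

```python
import json
import urllib.request

BASE = "https://contactcenterinsights.googleapis.com/v1"


def build_ingest_request(project_id, bucket_uri, object_type,
                         medium="PHONE_CALL", agent_id=None,
                         access_token="ACCESS_TOKEN"):
    """Build (but do not send) a conversations:ingest POST request."""
    if object_type not in ("AUDIO", "TRANSCRIPT"):
        raise ValueError(f"unexpected bucketObjectType: {object_type}")
    if medium not in ("PHONE_CALL", "CHAT"):
        raise ValueError(f"unexpected medium: {medium}")
    if "*" in bucket_uri:
        raise ValueError("wildcards are not supported in the bucket URI")

    url = f"{BASE}/projects/{project_id}/locations/us-central1/conversations:ingest"
    body = {
        "gcsSource": {"bucketUri": bucket_uri, "bucketObjectType": object_type},
        "transcriptObjectConfig": {"medium": medium},
    }
    if agent_id is not None:  # agentId is optional
        body["conversationConfig"] = {"agentId": agent_id}

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request. The response body then contains the operation name used in the following sections.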

Poll the operation

An IngestConversations request returns a long-running operation. Long-running methods are asynchronous, and the operation might not yet be completed when the method returns a response. You can poll the operation to check on its status. See the long-running operations page for details and code samples.
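
A polling loop might look like the following Python sketch. It relies on the standard long-running operation fields done and error. The wait_for_operation helper and its fetch callback are hypothetical: fetch stands in for whatever code issues the GET request for the operation resource and returns the parsed JSON.

```python
import time
from typing import Callable


def wait_for_operation(fetch: Callable[[], dict], interval_s: float = 5.0,
                       timeout_s: float = 600.0,
                       sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until the operation reports done, then return it.

    Raises RuntimeError if the finished operation carries an error, and
    TimeoutError if it is still running after timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation not done after {timeout_s} seconds")
        sleep(interval_s)  # wait before asking again
```

Injecting fetch and sleep keeps the loop independent of any particular HTTP client and makes it easy to exercise without network access.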

Cancel the operation

REST

Before using any of the request data, make the following replacements:

  • PROJECT_ID: your Google Cloud Platform project ID.
  • OPERATION_ID: the ID of the operation you want to cancel. This value was returned when you created the operation.

HTTP method and URL:

POST https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID:cancel

You should receive a JSON response similar to the following:

{}
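
The cancel call can be built the same way in code. This Python sketch only constructs the request. The build_cancel_request helper and the ACCESS_TOKEN placeholder are assumptions, not part of any official client.

```python
import urllib.request


def build_cancel_request(project_id, operation_id, access_token="ACCESS_TOKEN"):
    """Build (but do not send) the POST request that cancels an operation."""
    url = (
        "https://contactcenterinsights.googleapis.com/v1"
        f"/projects/{project_id}/locations/us-central1"
        f"/operations/{operation_id}:cancel"
    )
    return urllib.request.Request(
        url,
        data=b"",  # the cancel call carries no request body
        headers={"Authorization": f"Bearer {access_token}"},
        method="POST",
    )
```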