Use the API to import conversations

Learn how you can import audio and transcript files with their metadata using the API. You can import a single file using the UploadConversation API, or you can bulk import all the files from a Cloud Storage bucket using the IngestConversations API.

The two request commands UploadConversation and IngestConversations support the following functions:

Request command Number of files Speech-to-Text Redaction Metadata ingestion Automatic analysis
UploadConversation 1 ✔ metadata in the request
IngestConversations All files in a bucket ✔ metadata in the request

Prerequisites

  1. Enable the Cloud Storage, Speech-to-Text, Cloud Data Loss Prevention, and Contact Center AI Insights APIs on your Google Cloud project.
  2. Save your conversation data (dual-channel audio and transcript files) in a Cloud Storage bucket. Note the object path with the following format: gs://<bucket>/<object>
  3. Give the Speech-to-Text and Contact Center AI Insights service agents access to the objects in your Cloud Storage bucket. See this troubleshooting page for help with service accounts.
  4. If you opt to import conversation metadata, ensure that metadata files are in their own bucket and the metadata filenames match their corresponding conversation filename.

    For example, a conversation with the Cloud Storage URI gs://transcript-bucket-name/conversation.mp3 must have a corresponding metadata file such as gs://metadata-bucket-name/conversation.json.
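The naming rule in step 4 can be checked before an import. The following is a minimal sketch, assuming the match is made on the object's base name with a .json extension, as in the example; metadata_uri_for is a hypothetical helper, not part of any Google client library.

```python
import posixpath


def metadata_uri_for(conversation_uri: str, metadata_bucket_uri: str) -> str:
    """Return the metadata object URI expected for a conversation object.

    Assumption: the metadata file shares the conversation's base name and
    uses a .json extension, as in the example in step 4.
    """
    if not conversation_uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI")
    # Keep only the object's base name, then drop its extension.
    base_name = posixpath.basename(conversation_uri)
    stem, _extension = posixpath.splitext(base_name)
    return f"{metadata_bucket_uri.rstrip('/')}/{stem}.json"
```

Calling it with the two URIs from the example yields the metadata path shown there.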

Conversation data

Conversation data consists of voice or chat transcripts and audio.

Transcripts

Chat transcripts must be supplied as JSON-formatted files that match the CCAI conversation data format.

Voice transcripts can be supplied in the CCAI conversation data format or as the returned speech recognition result of a Speech-to-Text API transcription. The response is identical for synchronous and asynchronous recognition across all Speech-to-Text API versions.

Audio

Contact Center AI Insights uses Cloud Speech-to-Text batch recognition to transcribe audio. CCAI Insights configures Speech-to-Text transcription settings with Recognizer resources. You can create a custom recognizer in the request, or if you don't provide a recognizer either in Settings or in the request, CCAI Insights creates a default ccai-insights-recognizer in your project.

The CCAI Insights recognizer transcribes English speech using the telephony model, and the default language is en-US. For a full list of Speech-to-Text support per region, language, model, and recognition feature, refer to the Speech-to-Text language support docs.

Before your first audio import to CCAI Insights, assess whether you would like to:

  • Use a custom Speech-to-Text transcription configuration.
  • Analyze the (optionally) redacted conversations.

You can configure these actions to run by default in each UploadConversation or IngestConversations request by setting the corresponding fields in the project Settings resource. The speech and redaction settings can also be overridden per request. If you don't specify any speech settings, CCAI Insights uses the default speech settings and doesn't redact the transcripts.

Redaction

Cloud Data Loss Prevention does not redact transcripts unless you explicitly supply redaction configs in the project Settings, the UploadConversationRequest, or in the IngestConversationsRequest. Cloud Data Loss Prevention supports both inspection templates and de-identification templates for redaction.

Configure project settings

Redaction and speech can be configured for UploadConversation and IngestConversations requests by setting the corresponding project settings parameters. These configurations can also be set individually per request, which overrides the project settings. UploadConversation also supports analysis percentage configuration, though IngestConversations does not.
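A request body for this settings update could be assembled as in the sketch below. The field names come from the updateMask parameter and the request samples elsewhere on this page; the helper names are hypothetical, and the percentage is assumed to be a number from 0 to 100.

```python
import json

# Field paths taken from the updateMask query parameter of the settings request.
UPDATE_MASK = ",".join([
    "redaction_config",
    "speech_config",
    "analysis_config.upload_conversation_analysis_percentage",
])


def build_settings_body(deidentify_template, inspect_template,
                        recognizer, analysis_percentage):
    """Build a Settings body covering every field named in UPDATE_MASK."""
    return {
        "redaction_config": {
            "deidentify_template": deidentify_template,
            "inspect_template": inspect_template,
        },
        "speech_config": {"speech_recognizer": recognizer},
        "analysis_config": {
            "upload_conversation_analysis_percentage": analysis_percentage,
        },
    }


def write_request_file(body, path="request.json"):
    """Write the body to the file that curl reads with -d @request.json."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(body, handle, indent=2)
```

Passing the placeholders DEIDENTIFY_TEMPLATE, INSPECT_TEMPLATE, and RECOGNIZER_NAME produces a template file that you can edit by hand before sending.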

Save a request body that contains the fields named in updateMask in a file called request.json, and execute the following command. Settings updates use the PATCH method:

curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/settings?updateMask=redaction_config,speech_config,analysis_config.upload_conversation_analysis_percentage"

Metadata

For a single file import, include your QualityMetadata in the body of the UploadConversationRequest, as in the following curl command.

curl --request POST \
  'https://contactcenterinsights.googleapis.com/v1/projects/project-id/locations/location-id/conversations:upload' \
  --header 'Authorization: Bearer [YOUR_ACCESS_TOKEN]' \
  --header 'Accept: application/json' \
  --header 'Content-Type: application/json' \
  --data '{"conversation":{"qualityMetadata":{"agentInfo":[{"agentId":"agent-id","displayName":"agent-name"}]},"dataSource":{"gcsSource":{"transcriptUri":"gs://path/to_transcript"}}}}'

For a bulk import, supply conversation metadata as JSON-formatted files in the bucket specified in the gcs_source.metadata_bucket_uri field of the IngestConversationsRequest. CCAI Insights populates the conversation QualityMetadata from the values found in each file. If you list custom metadata keys in the custom_metadata_keys field of the request, CCAI Insights stores the matching metadata as conversation labels. Up to 100 labels are supported.

See the following example of a valid metadata file containing both quality and custom metadata.

{
  "customer_satisfaction_rating": 5,
  "agent_info": [
    {
      "agent_id": "123456",
      "display_name": "Agent Name",
      "team": "Agent Team",
      "disposition_code": "resolved"
    }
  ],
  "custom_key": "custom value"
}
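A metadata file like the one above can be checked locally before upload. This is a minimal sketch: QUALITY_KEYS lists only the quality fields visible in the example (an assumption, the full QualityMetadata schema may define more), and custom_metadata_keys_in is a hypothetical helper.

```python
import json

# Assumption: only the quality keys that appear in the example file are listed.
QUALITY_KEYS = {"customer_satisfaction_rating", "agent_info"}
# Label limit stated in the text above.
MAX_LABELS = 100


def custom_metadata_keys_in(metadata_json: str) -> list:
    """Return the top-level keys that would be treated as custom metadata."""
    metadata = json.loads(metadata_json)
    if not isinstance(metadata, dict):
        raise ValueError("a metadata file must contain a JSON object")
    keys = sorted(key for key in metadata if key not in QUALITY_KEYS)
    if len(keys) > MAX_LABELS:
        raise ValueError(f"{len(keys)} custom keys exceed the {MAX_LABELS}-label limit")
    return keys
```

The returned list is a candidate value for the custom_metadata_keys field of the request.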

Import a single audio file

The UploadConversation API creates a long-running operation that transcribes and optionally redacts your conversations. An audio file will be transcribed if the conversation contains only an audio_uri in the DataSource. Otherwise, the provided transcript_uri will be read and used.
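The rule for choosing between transcription and transcript import can be expressed as a small check. This sketch only mirrors the sentence above; will_transcribe_audio is a hypothetical helper, and the service itself makes the actual decision.

```python
def will_transcribe_audio(gcs_source: dict) -> bool:
    """Return True when the source holds only an audio_uri.

    Mirrors the documented rule: audio is transcribed only if no
    transcript_uri is provided; otherwise the transcript is read.
    """
    has_audio = bool(gcs_source.get("audio_uri"))
    has_transcript = bool(gcs_source.get("transcript_uri"))
    if not has_audio and not has_transcript:
        raise ValueError("gcs_source needs an audio_uri or a transcript_uri")
    return has_audio and not has_transcript
```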

Request JSON body:

{
  "conversation": {
    "data_source": {
      "gcs_source": { "audio_uri": "AUDIO_URI" }
    }
  },
  "redaction_config": {
    "deidentify_template": "DEIDENTIFY_TEMPLATE",
    "inspect_template": "INSPECT_TEMPLATE"
  },
  "speech_config": {
    "speech_recognizer": "RECOGNIZER_NAME"
  }
}
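The same body can also be assembled in Python. The sketch below uses only the standard library and treats the redaction and speech sections as optional, since both can fall back to the project settings; build_upload_request is a hypothetical helper, and the access token is assumed to come from a command such as gcloud auth print-access-token.

```python
import json
import urllib.request

API_ROOT = "https://contactcenterinsights.googleapis.com/v1"


def build_upload_request(project_id, location_id, audio_uri, access_token,
                         redaction_config=None, speech_config=None):
    """Build (but do not send) a conversations:upload request."""
    body = {
        "conversation": {
            "data_source": {"gcs_source": {"audio_uri": audio_uri}}
        }
    }
    # Optional per-request overrides of the project settings.
    if redaction_config:
        body["redaction_config"] = redaction_config
    if speech_config:
        body["speech_config"] = speech_config
    url = (f"{API_ROOT}/projects/{project_id}/locations/{location_id}"
           "/conversations:upload")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; nothing is sent until then.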

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/conversations:upload"

Bulk import

REST

Refer to the conversations:ingest API endpoint for complete details.

Before using any of the request data, make the following replacements:

  • PROJECT_ID: your Google Cloud Platform project ID.
  • GCS_BUCKET_URI: the Cloud Storage URI that points to the bucket containing the conversation audio or transcripts. May contain a prefix, for example, gs://BUCKET_NAME or gs://BUCKET_NAME/PREFIX. Wildcards are not supported.
  • MEDIUM: set to either PHONE_CALL or CHAT depending on the data type. If unspecified, the default value is PHONE_CALL.
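  • AGENT_ID: Optional. An agent ID that is applied to every conversation in the bucket.
  • AGENT_CHANNEL: Optional, audio only. The audio channel (1 or 2) that contains the agent.
  • CUSTOMER_CHANNEL: Optional, audio only. The audio channel (1 or 2) that contains the customer.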

HTTP method and URL:

POST https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/conversations:ingest

Request JSON body:

{
  "gcsSource":  {
    "bucketUri": "GCS_BUCKET_URI",
    "bucketObjectType": "AUDIO"
  },
  "transcriptObjectConfig": { "medium": "PHONE_CALL" },
  "conversationConfig": {
    "agentId": "AGENT_ID",
    "agentChannel": "AGENT_CHANNEL",
    "customerChannel": "CUSTOMER_CHANNEL"
  }
}

Or

{
  "gcsSource":  {
    "bucketUri": "GCS_BUCKET_URI",
    "bucketObjectType": "TRANSCRIPT"
  },
  "transcriptObjectConfig": { "medium": "MEDIUM" },
  "conversationConfig": {"agentId": "AGENT_ID"}
}
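Both request bodies follow the same shape, so they can be produced by one function. The following is a sketch under the constraints listed in the replacements above (allowed object types and mediums, no wildcards); build_ingest_body is a hypothetical helper, and treating the channels as integers is an assumption.

```python
def build_ingest_body(bucket_uri, object_type, medium="PHONE_CALL",
                      agent_id=None, agent_channel=None,
                      customer_channel=None):
    """Build an IngestConversations body for an AUDIO or TRANSCRIPT bucket."""
    if object_type not in ("AUDIO", "TRANSCRIPT"):
        raise ValueError("object_type must be AUDIO or TRANSCRIPT")
    if medium not in ("PHONE_CALL", "CHAT"):
        raise ValueError("medium must be PHONE_CALL or CHAT")
    if "*" in bucket_uri:
        raise ValueError("wildcards are not supported in the bucket URI")

    # Only include the optional conversation fields that were provided.
    conversation_config = {}
    if agent_id is not None:
        conversation_config["agentId"] = agent_id
    if agent_channel is not None:
        conversation_config["agentChannel"] = agent_channel
    if customer_channel is not None:
        conversation_config["customerChannel"] = customer_channel

    body = {
        "gcsSource": {"bucketUri": bucket_uri, "bucketObjectType": object_type},
        "transcriptObjectConfig": {"medium": medium},
    }
    if conversation_config:
        body["conversationConfig"] = conversation_config
    return body
```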

You should receive a JSON response similar to the following:


{
  "name": "projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.contactcenterinsights.v1main.IngestConversationsMetadata",
    "createTime": "...",
    "request": {
      "parent": "projects/PROJECT_ID/locations/us-central1",
      "gcsSource": {
        "bucketUri": "GCS_BUCKET_URI",
        "bucketObjectType": "BUCKET_OBJECT_TYPE"
      },
      "transcriptObjectConfig": {
        "medium": "MEDIUM"
      },
      "conversationConfig": {
        "agentId": "AGENT_ID"
      }
    }
  }
}

Poll the operation

Both the UploadConversation and IngestConversations requests return a long-running operation. Long-running methods are asynchronous, and the operation might not yet be completed when the method returns a response. You can poll the operation to check on its status. See the long-running operations page for details and code samples.
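Polling can be sketched as a loop that fetches the operation until its done field is true. In this sketch, fetch is a hypothetical callable that you supply: it is assumed to issue an authenticated GET for the operation name returned by the import request and to return the parsed JSON. The done and error fields are those of a standard long-running operation resource.

```python
import time
from typing import Callable


def wait_for_operation(fetch: Callable[[], dict], interval_s: float = 5.0,
                       timeout_s: float = 600.0, sleep=time.sleep) -> dict:
    """Call fetch() until the operation reports done, then return it.

    Raises RuntimeError if the finished operation carries an error, and
    TimeoutError if it is still running after timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not finish before the timeout")
        # Wait before asking again to avoid hammering the API.
        sleep(interval_s)
```

The interval and timeout defaults are arbitrary; bulk imports of large buckets may need a longer timeout.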