Learn how you can import audio and transcripts with their metadata using the API.
Prerequisites
- Make sure that the Cloud Storage, Speech-to-Text, Data Loss Prevention, and Insights APIs are enabled on your Google Cloud project.
- Make sure that your conversation data is staged in a Cloud Storage
bucket. See the Cloud Storage documentation for information about creating a bucket.
- The Speech-to-Text and Contact Center AI Insights service agents must
have access to the objects in your Cloud Storage bucket. The email
addresses for the service agents use the following formats, respectively:
service-PROJECT_NUMBER@gcp-sa-speech.iam.gserviceaccount.com
service-PROJECT_NUMBER@gcp-sa-contactcenterinsights.iam.gserviceaccount.com
- If you opt to import conversation metadata, ensure that the metadata files are in their own bucket and that each metadata filename matches its corresponding conversation filename. For example, a conversation with the Cloud Storage URI gs://transcript-bucket-name/conversation.mp3 should have a corresponding metadata file such as gs://metadata-bucket-name/conversation.json.
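The filename-matching rule in the last prerequisite can be checked before an import. The following is a minimal sketch, not part of the API: the helper name `metadata_uri_for` is hypothetical, and it assumes the metadata file reuses the conversation's base filename with a .json extension at the root of the metadata bucket, as in the example above.

```python
import posixpath
from urllib.parse import urlparse


def metadata_uri_for(conversation_uri: str, metadata_bucket: str) -> str:
    """Return the metadata URI expected for a conversation object.

    Assumption: the metadata file shares the conversation's base filename,
    uses a .json extension, and sits at the root of the metadata bucket.
    """
    parsed = urlparse(conversation_uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a Cloud Storage URI: {conversation_uri!r}")
    # "/dir/conversation.mp3" -> "conversation"
    stem, _ = posixpath.splitext(posixpath.basename(parsed.path))
    if not stem:
        raise ValueError(f"URI does not name an object: {conversation_uri!r}")
    return f"gs://{metadata_bucket}/{stem}.json"
```

Comparing the result against the objects actually present in the metadata bucket shows which conversations would be imported without metadata.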
Import conversation data
Transcripts
Chat transcripts must be supplied as JSON-formatted files that match the CCAI conversation data format.
Voice transcripts can be supplied in the CCAI conversation data format or as the returned speech recognition result of a Speech-to-Text API transcription. The response is identical for synchronous and asynchronous recognition across all Speech-to-Text API versions.
Audio
Contact Center AI Insights uses Cloud Speech-to-Text batch recognition to transcribe audio. By default, transcription is done for the en-US language using the telephony model. Redaction is not done unless redaction configs are explicitly supplied in the project Settings or in the IngestConversationsRequest.
For detailed information on how to optionally configure transcription and redaction for bulk audio import, refer to the instructions for the single audio import API, which apply here.
Metadata
Conversation metadata files must be supplied as JSON-formatted files in a bucket specified in the gcs_source.metadata_bucket_uri field of the IngestConversationsRequest. CCAI Insights will populate conversation QualityMetadata if found in the file. Additionally, any custom metadata provided in the custom_metadata_keys field of the request will be stored in the conversation labels. Up to 100 labels are supported per conversation.
See the following example of a valid metadata file containing both quality and custom metadata.
{
  "customer_satisfaction_rating": 5,
  "agent_info": [
    {
      "agent_id": "123456",
      "display_name": "Agent Name",
      "team": "Agent Team",
      "disposition_code": "resolved"
    }
  ],
  "custom_key": "custom value"
}
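To decide which keys to list in custom_metadata_keys, it can help to separate the quality fields from everything else in a metadata file. The sketch below is an illustration under stated assumptions: the helper `custom_keys` is hypothetical, `QUALITY_FIELDS` contains only the quality fields that appear in the example above, and the 100-label limit is applied to custom keys alone, which is a simplification.

```python
import json

# Assumption: only the quality fields shown in the example metadata file.
# Extend this set if your files carry other QualityMetadata fields.
QUALITY_FIELDS = {"customer_satisfaction_rating", "agent_info"}
MAX_LABELS = 100  # per-conversation label limit stated in this section


def custom_keys(metadata_json: str) -> list[str]:
    """Return the sorted top-level keys that are not quality fields."""
    metadata = json.loads(metadata_json)
    if not isinstance(metadata, dict):
        raise ValueError("metadata file must contain a JSON object")
    keys = sorted(k for k in metadata if k not in QUALITY_FIELDS)
    if len(keys) > MAX_LABELS:
        raise ValueError(f"{len(keys)} custom keys exceed the {MAX_LABELS}-label limit")
    return keys
```

For the example file above, the only custom key is custom_key.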
Sample Commands
REST
Refer to the conversations:ingest API endpoint for complete details.
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud Platform project ID.
- GCS_BUCKET_URI: the Cloud Storage URI that points to the bucket containing the conversation transcripts. May contain a prefix. For example gs://BUCKET_NAME or gs://BUCKET_NAME/PREFIX. Wildcards are not supported.
- MEDIUM: set to either PHONE_CALL or CHAT depending on the data type. If unspecified, the default value is PHONE_CALL.
- AGENT_ID: Optional. Agent ID for the entire bucket.
HTTP method and URL:
POST https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/conversations:ingest
Request JSON body:
{
  "gcsSource": {
    "bucketUri": "GCS_BUCKET_URI",
    "bucketObjectType": "AUDIO"
  },
  "transcriptObjectConfig": {
    "medium": "PHONE_CALL"
  },
  "conversationConfig": {
    "agentId": "AGENT_ID",
    "agentChannel": "AGENT_CHANNEL",
    "customerChannel": "CUSTOMER_CHANNEL"
  }
}
Or
{
  "gcsSource": {
    "bucketUri": "GCS_BUCKET_URI",
    "bucketObjectType": "TRANSCRIPT"
  },
  "transcriptObjectConfig": {
    "medium": "MEDIUM"
  },
  "conversationConfig": {
    "agentId": "AGENT_ID"
  }
}
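The request body can also be assembled programmatically. The following sketch uses only the Python standard library and only the fields shown in the request bodies above; the function name `build_ingest_body` and its validation rules are illustrative assumptions, not part of any client library, and the code does not send the request.

```python
import json
from typing import Optional

OBJECT_TYPES = {"AUDIO", "TRANSCRIPT"}
MEDIUMS = {"PHONE_CALL", "CHAT"}


def build_ingest_body(
    bucket_uri: str,
    object_type: str = "TRANSCRIPT",
    medium: str = "PHONE_CALL",
    agent_id: Optional[str] = None,
) -> dict:
    """Build a conversations:ingest request body as a plain dict."""
    # Wildcards are not supported in the bucket URI.
    if not bucket_uri.startswith("gs://") or "*" in bucket_uri:
        raise ValueError(f"invalid bucket URI: {bucket_uri!r}")
    if object_type not in OBJECT_TYPES:
        raise ValueError(f"object_type must be one of {sorted(OBJECT_TYPES)}")
    if medium not in MEDIUMS:
        raise ValueError(f"medium must be one of {sorted(MEDIUMS)}")
    body = {
        "gcsSource": {"bucketUri": bucket_uri, "bucketObjectType": object_type},
        "transcriptObjectConfig": {"medium": medium},
    }
    # agentId is optional, so it is only included when supplied.
    if agent_id is not None:
        body["conversationConfig"] = {"agentId": agent_id}
    return body


if __name__ == "__main__":
    print(json.dumps(build_ingest_body("gs://BUCKET_NAME/PREFIX", medium="CHAT"), indent=2))
```

The resulting dict, serialized as JSON, is what would be sent in a POST to the URL above with suitable authorization.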
You should receive a JSON response similar to the following:
{
  "name": "projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.contactcenterinsights.v1main.IngestConversationsMetadata",
    "createTime": "...",
    "request": {
      "parent": "projects/PROJECT_ID/locations/us-central1",
      "gcsSource": {
        "bucketUri": "GCS_BUCKET_URI",
        "bucketObjectType": "BUCKET_OBJECT_TYPE"
      },
      "transcriptObjectConfig": {
        "medium": "MEDIUM"
      },
      "conversationConfig": {
        "agentId": "AGENT_ID"
      }
    }
  }
}
Poll the operation
An IngestConversations request returns a long-running operation. Long-running methods are asynchronous, and the operation might not yet be completed when the method returns a response. You can poll the operation to check on its status. See the long-running operations page for details and code samples.
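A polling loop might look like the following sketch. It relies on the standard `done` and `error` fields of a long-running operation resource; the function `wait_for_operation` and its `fetch` callback are hypothetical, and `fetch` stands in for whatever code you use to GET the operation and return its JSON as a dict.

```python
import time
from typing import Callable


def wait_for_operation(
    fetch: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the operation reports done, then return it."""
    deadline = time.monotonic() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            # A finished operation carries either "error" or "response".
            if "error" in operation:
                raise RuntimeError(f"operation failed: {operation['error']}")
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation did not complete before the timeout")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real delays.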
Cancel the operation
REST
Before using any of the request data, make the following replacements:
- PROJECT_ID: your Google Cloud Platform project ID.
- OPERATION_ID: the ID of the operation you want to cancel. This value was returned when you created the operation.
HTTP method and URL:
POST https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID:cancel
You should receive a JSON response similar to the following:
{}
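Because the ingest response returns the full operation name, the cancel URL can be derived from it rather than assembled from separate IDs. This is a small sketch under that assumption; the helper `cancel_url` is hypothetical and only builds the URL string.

```python
API_ROOT = "https://contactcenterinsights.googleapis.com/v1"


def cancel_url(operation_name: str) -> str:
    """Build the :cancel URL from an operation name such as
    projects/PROJECT_ID/locations/us-central1/operations/OPERATION_ID."""
    parts = operation_name.split("/")
    well_formed = (
        len(parts) == 6
        and parts[0] == "projects"
        and parts[2] == "locations"
        and parts[4] == "operations"
        and all(parts)  # no empty segments
    )
    if not well_formed:
        raise ValueError(f"unexpected operation name: {operation_name!r}")
    return f"{API_ROOT}/{operation_name}:cancel"
```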