Upload single audio files using the API

Contact Center AI Insights lets you upload dual-channel audio conversation data directly from Cloud Storage. These conversations can optionally be transcribed with custom settings, redacted, and analyzed. This page describes how to transcribe, redact, and automatically analyze conversations using the UploadConversation REST API.

Prerequisites

Make sure that Cloud Storage, Speech-to-Text, Cloud Data Loss Prevention, and the Contact Center AI Insights APIs are enabled on your Google Cloud project. For details on enabling an API, see the before-you-begin guide.
Make sure that your dual-channel audio data is staged in a Cloud Storage bucket. Take note of the object path, which will be formatted as gs://<bucket>/<object>.
The Speech-to-Text service account requires access to your files on Cloud Storage. Replace the variable in the example with your Google Cloud project number:
```
service-PROJECT_NUMBER@gcp-sa-speech.iam.gserviceaccount.com
```
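Before granting access, it can help to assemble the service account name and sanity-check the object path in code. The following is a minimal sketch; the helper names `speech_service_agent` and `split_gcs_uri` are hypothetical, and the URI check is purely syntactic (it does not confirm that the object exists).

```python
def speech_service_agent(project_number: str) -> str:
    """Return the Speech-to-Text service account for a project number."""
    # The address uses the numeric project number, not the project ID.
    if not str(project_number).isdigit():
        raise ValueError("expected a numeric project number, not a project ID")
    return f"service-{project_number}@gcp-sa-speech.iam.gserviceaccount.com"


def split_gcs_uri(uri: str) -> tuple[str, str]:
    """Split gs://<bucket>/<object> into (bucket, object)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    bucket, sep, obj = uri[len(prefix):].partition("/")
    if not bucket or not sep or not obj:
        raise ValueError(f"expected gs://<bucket>/<object>, got {uri!r}")
    return bucket, obj
```

For example, `split_gcs_uri("gs://my-bucket/calls/a.wav")` yields the bucket to grant access on and the object to reference in the upload request.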

(Optional) Apply custom transcription settings, redaction, and analyses

Before your first audio upload to CCAI Insights, decide whether you want to:

Use a custom Speech-to-Text transcription configuration.
Redact transcribed conversations.
Analyze the (optionally) redacted conversations.

You can configure these actions to run by default in each UploadConversation request by setting the proper fields in the project Settings resource. The speech and redaction settings can also be overridden per-request.

Prepare your Cloud Data Loss Prevention templates

Both Inspection templates and De-identification templates in the global region are supported.
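The redaction settings refer to templates by resource name. As a rough sketch, the names could be assembled as follows. This assumes the location-scoped Cloud DLP naming pattern with the `global` location; the helper `dlp_template_names` is hypothetical, so compare the result with the names that Cloud DLP reports for your templates before using it.

```python
def dlp_template_names(project_id: str, inspect_id: str, deidentify_id: str) -> dict:
    """Build redaction_config-style fields from template IDs.

    Assumption: templates live in the "global" location and follow the
    location-scoped Cloud DLP resource name pattern.
    """
    parent = f"projects/{project_id}/locations/global"
    return {
        "inspect_template": f"{parent}/inspectTemplates/{inspect_id}",
        "deidentify_template": f"{parent}/deidentifyTemplates/{deidentify_id}",
    }
```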

Create a custom Speech-to-Text recognizer

CCAI Insights configures Speech-to-Text transcription settings with Recognizer resources. A recognizer with the name ccai-insights-recognizer is created in your project if you don't provide one in Settings or in the request. The CCAI Insights recognizer transcribes English speech using the telephony model.

For a full list of Speech-to-Text support per region, language, model, and recognition feature, refer to the Speech-to-Text language support docs.
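A recognizer is referenced by its full resource name. The sketch below assumes the Speech-to-Text v2 naming pattern `projects/*/locations/*/recognizers/*`; the helper `recognizer_name` is hypothetical, and the location of the automatically created recognizer is not stated here, so it is left as a required argument.

```python
# ID of the recognizer that CCAI Insights creates when none is provided.
DEFAULT_RECOGNIZER_ID = "ccai-insights-recognizer"


def recognizer_name(project_id: str, location: str,
                    recognizer_id: str = DEFAULT_RECOGNIZER_ID) -> str:
    """Return a recognizer resource name (assumed Speech-to-Text v2 pattern)."""
    for part in (project_id, location, recognizer_id):
        if not part or "/" in part:
            raise ValueError(f"invalid resource name segment: {part!r}")
    return f"projects/{project_id}/locations/{location}/recognizers/{recognizer_id}"
```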

Configure project settings

Permissions required for this task

To perform this task, you must have the following permissions:

contactcenterinsights.settings.update

Redaction, speech, and analysis percentage (0-100%) can be configured for all UploadConversation requests by setting the corresponding project settings parameters. Note that speech and redaction configurations can also be set individually per request, and will override the project settings.

Request JSON body:

{
  "redaction_config": {
    "deidentify_template": "DEIDENTIFY_TEMPLATE_NAME",
    "inspect_template": "INSPECT_TEMPLATE_NAME"
  },
  "speech_config": {
    "speech_recognizer": "RECOGNIZER_NAME"
  },
  "analysis_config": {
    "upload_conversation_analysis_percentage": ANALYSIS_PERCENTAGE
  }
}

Save the request body in a file called request.json, and execute the following command:

curl -X PATCH \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/settings?updateMask=redaction_config,speech_config,analysis_config.upload_conversation_analysis_percentage"
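The same settings update can be prepared programmatically. The following is a minimal sketch using only the Python standard library; `build_settings_request` is a hypothetical helper, not part of any client library. It builds the request without sending it, lists in `updateMask` only the fields that are actually set, and uses PATCH, the method that update calls with an update mask conventionally use.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://contactcenterinsights.googleapis.com/v1"


def build_settings_request(project_id, location, token, *, recognizer=None,
                           inspect_template=None, deidentify_template=None,
                           analysis_percentage=None):
    """Build (but do not send) a request that updates project settings."""
    body, mask = {}, []
    redaction = {}
    if deidentify_template:
        redaction["deidentify_template"] = deidentify_template
    if inspect_template:
        redaction["inspect_template"] = inspect_template
    if redaction:
        body["redaction_config"] = redaction
        mask.append("redaction_config")
    if recognizer:
        body["speech_config"] = {"speech_recognizer": recognizer}
        mask.append("speech_config")
    if analysis_percentage is not None:
        if not 0 <= analysis_percentage <= 100:
            raise ValueError("analysis percentage must be between 0 and 100")
        body["analysis_config"] = {
            "upload_conversation_analysis_percentage": analysis_percentage
        }
        mask.append("analysis_config.upload_conversation_analysis_percentage")
    if not mask:
        raise ValueError("nothing to update")
    query = urllib.parse.urlencode({"updateMask": ",".join(mask)})
    url = f"{API_ROOT}/projects/{project_id}/locations/{location}/settings?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. The token can be the output of `gcloud auth print-access-token`.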

Upload your conversation data

Permissions required for this task

To perform this task, you must have the following permissions:

contactcenterinsights.conversations.upload

The UploadConversation API creates a long-running operation that transcribes and optionally redacts your conversations. The conversation in the request is the same conversation that you would include if calling CreateConversation. An audio file is transcribed if the conversation contains only an audio_uri in the DataSource. Otherwise, the provided transcript_uri is read and used.

Request JSON body:

{
  "conversation": {
    "data_source": {
      "gcs_source": { "audio_uri": "AUDIO_URI" }
    }
  },
  "redaction_config": {
    "deidentify_template": "DEIDENTIFY_TEMPLATE",
    "inspect_template": "INSPECT_TEMPLATE"
  },
  "speech_config": {
    "speech_recognizer": "RECOGNIZER_NAME"
  }
}

Save the request body in a file called request.json, and execute the following command:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://contactcenterinsights.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION_ID/conversations:upload"
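The upload request can also be assembled in code. The sketch below uses only the Python standard library and builds the request without sending it. The helper names (`build_upload_body`, `needs_transcription`, `build_upload_request`) are hypothetical, and `needs_transcription` simply mirrors the rule described above: audio is transcribed only when no transcript_uri is supplied.

```python
import json
import urllib.request

API_ROOT = "https://contactcenterinsights.googleapis.com/v1"


def build_upload_body(audio_uri=None, transcript_uri=None, recognizer=None,
                      inspect_template=None, deidentify_template=None):
    """Assemble the JSON body for an UploadConversation call."""
    gcs_source = {}
    if audio_uri:
        gcs_source["audio_uri"] = audio_uri
    if transcript_uri:
        gcs_source["transcript_uri"] = transcript_uri
    if not gcs_source:
        raise ValueError("provide audio_uri, transcript_uri, or both")
    body = {"conversation": {"data_source": {"gcs_source": gcs_source}}}
    # Per-request overrides are optional; omit them to use project settings.
    redaction = {}
    if deidentify_template:
        redaction["deidentify_template"] = deidentify_template
    if inspect_template:
        redaction["inspect_template"] = inspect_template
    if redaction:
        body["redaction_config"] = redaction
    if recognizer:
        body["speech_config"] = {"speech_recognizer": recognizer}
    return body


def needs_transcription(body):
    """True when the body carries audio but no transcript."""
    src = body["conversation"]["data_source"]["gcs_source"]
    return "audio_uri" in src and "transcript_uri" not in src


def build_upload_request(project_id, location, token, body):
    """Build (but do not send) the conversations:upload POST request."""
    url = f"{API_ROOT}/projects/{project_id}/locations/{location}/conversations:upload"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json; charset=utf-8",
        },
    )
```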

The API will return a long-running operation if successful. See the long-running operations documentation for more details on working with operations.
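One way to wait for the result is to poll the operation until its `done` field is true. The following is a minimal sketch that assumes the standard long-running operation shape (`done`, then either `error` or `response`). `get_operation` is a caller-supplied, hypothetical callable that fetches the operation resource by name and returns it as a dict; the interval and timeout defaults are arbitrary.

```python
import time


def wait_for_operation(get_operation, name, *, interval_s=5.0,
                       timeout_s=600.0, sleep=time.sleep):
    """Poll an operation until it finishes, then return its response dict."""
    deadline = time.monotonic() + timeout_s
    while True:
        op = get_operation(name)
        if op.get("done"):
            # A finished operation carries either an error or a response.
            if "error" in op:
                raise RuntimeError(f"operation {name} failed: {op['error']}")
            return op.get("response", {})
        if time.monotonic() >= deadline:
            raise TimeoutError(f"operation {name} did not finish in {timeout_s}s")
        sleep(interval_s)
```

Passing the fetch function as an argument keeps the polling logic independent of the HTTP client and makes it straightforward to test with canned responses.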