Import conversations using client tooling

CCAI Insights supports creating and analyzing conversations through the Console and through the REST API. Once conversations are uploaded, you can view them and their analysis results in the Console. To make uploading a large number of conversations more efficient, CCAI Insights provides client tooling (a Python script) that performs a bulk upload.

Prerequisites

  1. Make sure that the Storage, Speech-to-Text and Insights APIs are enabled on your Google Cloud Platform project.
  2. Create and analyze a conversation using the Insights API.
  3. Set up the Google Cloud SDK.
  4. Set up service account authentication.
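
Before running the tool, a quick local check can catch a missing SDK or credential. The following is a minimal sketch, not part of the import tooling: the helper name missing_prerequisites is hypothetical, and it assumes that service account authentication is done through a key file referenced by the GOOGLE_APPLICATION_CREDENTIALS environment variable. It does not verify that the APIs are enabled on the project.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper. Assumes key-file authentication via the
    GOOGLE_APPLICATION_CREDENTIALS environment variable.
    """
    env = os.environ if env is None else env
    problems = []

    # The Cloud SDK provides the gcloud command.
    if which("gcloud") is None:
        problems.append("gcloud not found on PATH")

    # Check that the variable is set and points at an existing file.
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(key_path):
        problems.append(f"key file not found: {key_path}")

    return problems


for problem in missing_prerequisites():
    print("Prerequisite issue:", problem)
```

The env and which parameters exist only so that the check can be exercised without touching the real environment.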

Access the import tool

You can access the import tooling in two ways:

  1. You can download it from a Cloud Storage folder.

    • You can also download the folder using the gsutil tool:
    gsutil cp -r gs://dialogflow-docs-downloads/contact-center-insights-preview .
    
  2. You can use git. If you have never used git before, you might need to first set it up. To access the client tooling, enter the following command:

    git clone "https://partner-code.googlesource.com/ccai-insights-client"
    

Set up your development environment

The import tooling has some Python package dependencies that must be installed. Run the following commands from the directory where you have saved the downloaded tooling:

python3 -m pip install --user --upgrade pip
python3 -m pip install -r requirements.txt

Set up import tool inputs

The tooling expects a Google Cloud Storage bucket containing the conversations that you want to upload into CCAI Insights. The inputs can be one of the following:

  • Audio files
  • Voice transcripts
  • Chat transcripts

For audio files, the tooling automatically transcribes the files and places the resulting transcripts in the bucket specified by --dest_gcs_bucket.

For each audio and transcript file, a conversation will be created in CCAI Insights and analyzed.

Run the import tool

Import tool usage:

usage: import_conversations.py [-h]
       (--source_local_audio_path SOURCE_LOCAL_AUDIO_PATH | \
        --source_audio_gcs_bucket SOURCE_AUDIO_GCS_BUCKET | \
        --source_voice_transcript_gcs_bucket SOURCE_VOICE_TRANSCRIPT_GCS_BUCKET | \
        --source_chat_transcript_gcs_bucket SOURCE_CHAT_TRANSCRIPT_GCS_BUCKET)
       [--dest_gcs_bucket DEST_GCS_BUCKET]
       [--impersonated_service_account IMPERSONATED_SERVICE_ACCOUNT]
       [--redact REDACT]
       [--analyze ANALYZE]
       [--insights_endpoint INSIGHTS_ENDPOINT]
       [--language_code LANGUAGE_CODE]
       [--encoding ENCODING]
       [--sample_rate_hertz SAMPLE_RATE_HERTZ]
       [--agent_id AGENT_ID]
       PROJECT

An example command for importing conversations from a Google Cloud Storage bucket containing audio files:

python3 import_conversations.py \
        --source_audio_gcs_bucket my_audios_bucket \
        --dest_gcs_bucket my_transcripts_bucket \
        --encoding MP3 \
        --sample_rate_hertz 44100 \
        my-project-id

An example command for importing conversations from a Google Cloud Storage bucket containing chat transcripts:

python3 import_conversations.py \
        --source_chat_transcript_gcs_bucket my_chat_transcripts_bucket \
        my-project-id

An example command for importing conversations from a Google Cloud Storage bucket containing voice transcripts:

python3 import_conversations.py \
        --source_voice_transcript_gcs_bucket my_voice_transcripts_bucket \
        my-project-id
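
If you launch the tool from another script, the command line can be assembled programmatically. The following is a minimal sketch: build_import_command is a hypothetical helper, not part of the tooling, and it uses only the option names listed in the usage text. Option values are passed through as strings; the value format that the tool expects for options such as --redact and --analyze is not specified here, so treat those as assumptions to verify with the tool's -h output.

```python
import shlex

# Exactly one source option is required (they are mutually exclusive in the usage text).
SOURCE_FLAGS = (
    "source_local_audio_path",
    "source_audio_gcs_bucket",
    "source_voice_transcript_gcs_bucket",
    "source_chat_transcript_gcs_bucket",
)

# Optional flags copied from the usage text.
OPTIONAL_FLAGS = (
    "dest_gcs_bucket",
    "impersonated_service_account",
    "redact",
    "analyze",
    "insights_endpoint",
    "language_code",
    "encoding",
    "sample_rate_hertz",
    "agent_id",
)


def build_import_command(project, source_flag, source_value, **options):
    """Return an argv list for import_conversations.py (hypothetical helper)."""
    if source_flag not in SOURCE_FLAGS:
        raise ValueError(f"unknown source flag: {source_flag}")
    argv = ["python3", "import_conversations.py", f"--{source_flag}", source_value]
    for name, value in options.items():
        if name not in OPTIONAL_FLAGS:
            raise ValueError(f"unknown option: {name}")
        argv += [f"--{name}", str(value)]
    # PROJECT is the final positional argument.
    argv.append(project)
    return argv


cmd = build_import_command(
    "my-project-id",
    "source_audio_gcs_bucket",
    "my_audios_bucket",
    dest_gcs_bucket="my_transcripts_bucket",
    encoding="MP3",
    sample_rate_hertz=44100,
)
# Print the command for inspection before running it.
print(shlex.join(cmd))
```

To run the command, pass the list to subprocess.run(cmd, check=True) from the directory that contains import_conversations.py.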

If you are using an impersonated service account (the --impersonated_service_account option), make sure that you are able to impersonate that account beforehand. Then, before running the tool, set your default gcloud credential to your user credential by running:

gcloud auth application-default login