
Configure Speech-to-Text model adaptation

Agent Assist uses Speech-to-Text model adaptation to improve transcription quality by recognizing specific phrases more often than others. This page explains how to configure model adaptation for Speech-to-Text transcription.

Use the Speech-to-Text console

The Speech-to-Text console can create global phrase sets only. Region-specific phrase sets must be created with the Speech-to-Text API.

1. In the Google Cloud console, go to the [Speech-to-Text] page.
2. Click [Model adaptation].
3. Click [New Resource].
4. Select the phrase set resource and API version V1, enter the phrases and their boost values, and copy the phrase set name.
5. Click [Save].
6. Go to the Agent Assist console.
7. Click [Conversation profiles], then select the conversation profile to edit.
8. Go to the [Phrase sets] section and paste the phrase set name.

Use the Speech-to-Text API

Create a phrase set by following the Speech-to-Text speech recognition instructions.

Then run the following Python script to update the conversation profile.

import google.auth
from google.auth.transport.requests import AuthorizedSession

# Conversation profile to update
PROJECT_ID = "sample-project"
LOCATION = "global"
CONVERSATION_PROFILE_ID = "sample-conversation-profile"
# Speech model adaptation resource names (phrase sets)
SPEECH_ADAPTATION_PHRASES = ["projects/sample-project/locations/global/phraseSets/sample-phrase-sets"]

# Build an authorized HTTP session from Application Default Credentials
scopes = ["https://www.googleapis.com/auth/cloud-platform"]
credentials, project = google.auth.default(
    scopes=scopes,
    quota_project_id=PROJECT_ID,
)
session = AuthorizedSession(credentials)

profile_url = f"https://dialogflow.googleapis.com/v2beta1/projects/{PROJECT_ID}/locations/{LOCATION}/conversationProfiles/{CONVERSATION_PROFILE_ID}"

# Confirm that the conversation profile exists before patching it
print("Checking for existing ConversationProfile...")
get_response = session.get(profile_url)
print(get_response.status_code)
print(get_response.json())
if get_response.status_code == 200:
    # Update only the phrase sets in the profile's Speech-to-Text config
    print("Updating ConversationProfile...")
    patch_response = session.patch(
        profile_url,
        params={"updateMask": "sttConfig.phraseSets"},
        json={"sttConfig": {"phraseSets": SPEECH_ADAPTATION_PHRASES}},
    )
    print(patch_response.status_code)
    print(patch_response.json())
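
Before sending the PATCH request, it can be useful to check that each phrase set entry has the resource-name shape used in the script (projects/.../locations/.../phraseSets/...). The following is a minimal sketch only: the helper name parse_phrase_set_name and the regular expression are illustrative assumptions inferred from the sample name, not part of any Google client library.

```python
import re

# Assumed pattern, inferred from the sample resource name in the script above.
_PHRASE_SET_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/phraseSets/(?P<phrase_set>[^/]+)$"
)


def parse_phrase_set_name(name: str) -> dict:
    """Split a phrase set resource name into its parts, or raise ValueError."""
    match = _PHRASE_SET_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a phrase set resource name: {name!r}")
    return match.groupdict()


if __name__ == "__main__":
    parts = parse_phrase_set_name(
        "projects/sample-project/locations/global/phraseSets/sample-phrase-sets"
    )
    print(parts)
```

Running the check over every entry of SPEECH_ADAPTATION_PHRASES catches a pasted short ID (for example, only "sample-phrase-sets") before the API call is made.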

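Regional phrase sets

Speech-to-Text model adaptation supports English (en-US) only, but you can use the Speech-to-Text API to configure phrase sets for other locales. This is especially useful when transcribing English conversations that take place in those locales.

Use the following sample command to create a region-specific phrase set with the Speech-to-Text API.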

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -H "X-Goog-User-Project: sample-project" \
    -d @sample_phrase_sets.json \
    "https://us-speech.googleapis.com/v1/projects/sample-project/locations/us/phraseSets"

The JSON file sample_phrase_sets.json contains the following phrase set definition.

{
  "parent": "projects/sample-project/locations/us",
  "phraseSetId": "sample-phrase-sets",
  "phraseSet": {
    "name": "sample-phrase-sets",
    "phrases": [
      {
        "value": "Some phrase",
        "boost": 20
      }
    ]
  }
}
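
The same request can be sketched in Python, in the style of the conversation profile script. This is a hedged illustration, not an official client: the function names build_create_phrase_set_request and create_phrase_set are hypothetical, the regional host pattern is generalized from the single sample URL, and the unprefixed host for "global" is an assumption. The request body mirrors the sample JSON file.

```python
import json


def build_create_phrase_set_request(project_id, location, phrase_set_id, phrases):
    """Return (url, body) mirroring the sample curl command and JSON file.

    `phrases` is a list of (value, boost) pairs.
    """
    # Assumption: regional hosts follow the "<location>-speech" pattern of the
    # sample URL; "global" is assumed to use the unprefixed host.
    if location == "global":
        host = "speech.googleapis.com"
    else:
        host = f"{location}-speech.googleapis.com"
    parent = f"projects/{project_id}/locations/{location}"
    url = f"https://{host}/v1/{parent}/phraseSets"
    body = {
        "parent": parent,
        "phraseSetId": phrase_set_id,
        "phraseSet": {
            "name": phrase_set_id,
            "phrases": [{"value": value, "boost": boost} for value, boost in phrases],
        },
    }
    return url, body


def create_phrase_set(project_id, location, phrase_set_id, phrases):
    """Send the request with Application Default Credentials (needs google-auth)."""
    # Imported lazily so the request builder can be used without google-auth.
    import google.auth
    from google.auth.transport.requests import AuthorizedSession

    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"],
        quota_project_id=project_id,
    )
    url, body = build_create_phrase_set_request(
        project_id, location, phrase_set_id, phrases
    )
    return AuthorizedSession(credentials).post(url, json=body)


if __name__ == "__main__":
    url, body = build_create_phrase_set_request(
        "sample-project", "us", "sample-phrase-sets", [("Some phrase", 20)]
    )
    print(url)
    print(json.dumps(body, indent=2))
```

Keeping the URL and body construction separate from the network call makes it possible to inspect the request before anything is sent.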

For a conversation profile in a single Dialogflow region, the following table shows the Speech-to-Text region in which to create the phrase set.

Dialogflow region	Speech-to-Text region
us, us-central1, us-east1, us-east7, us-west1, northamerica-northeast1, northamerica-northeast2	us
eu, europe-west1, europe-west2, europe-west3, europe-west4	eu
australia-southeast1, asia-northeast1, asia-south1, asia-southeast1, me-west1, global	global
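
The table can also be expressed as a lookup, which may help when scripts create phrase sets for several conversation profiles. This is a minimal sketch: the function name speech_to_text_region is hypothetical, and the mapping is transcribed from the table above, so it covers only the regions listed there.

```python
# Mapping transcribed from the table above; regions not listed are rejected.
_STT_REGION_BY_DIALOGFLOW_REGION = {
    **dict.fromkeys(
        [
            "us", "us-central1", "us-east1", "us-east7", "us-west1",
            "northamerica-northeast1", "northamerica-northeast2",
        ],
        "us",
    ),
    **dict.fromkeys(
        ["eu", "europe-west1", "europe-west2", "europe-west3", "europe-west4"],
        "eu",
    ),
    **dict.fromkeys(
        [
            "australia-southeast1", "asia-northeast1", "asia-south1",
            "asia-southeast1", "me-west1", "global",
        ],
        "global",
    ),
}


def speech_to_text_region(dialogflow_region: str) -> str:
    """Return the Speech-to-Text region for a Dialogflow region."""
    try:
        return _STT_REGION_BY_DIALOGFLOW_REGION[dialogflow_region]
    except KeyError:
        raise ValueError(
            f"No Speech-to-Text region listed for {dialogflow_region!r}"
        ) from None


if __name__ == "__main__":
    for region in ("us-east1", "europe-west2", "asia-south1"):
        print(region, "->", speech_to_text_region(region))
```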