本頁面由 Cloud Translation API 翻譯而成。

利用音訊輸入檔案偵測意圖

本指南說明如何使用 API，將音訊輸入內容傳送至偵測意圖要求。Dialogflow 會處理音訊並將其轉換為文字，接著再嘗試比對意圖。這項轉換作業稱為「音訊輸入」、「語音辨識」、「語音轉文字」或「STT」。

事前準備

這項功能僅適用於使用 API 進行使用者互動的情況。如果您使用的是整合項目，可以略過本指南。

閱讀本指南之前，請先完成下列工作：

詳閱 Dialogflow 基本概念。
執行設定步驟。

建立虛擬服務專員

如果尚未建立代理程式，請立即建立：

前往 Dialogflow ES 主控台。
按照系統要求登入 Dialogflow 主控台。詳情請參閱 Dialogflow 主控台總覽。
按一下左側欄選單中的 [Create Agent] (建立代理程式)。(如果您已有其他代理程式，請按一下代理程式名稱然後捲動至底部，再按一下 [Create new agent] (建立新代理程式)。)
輸入代理程式的名稱、預設語言和預設時區。
如果您已建立專案，請輸入該項專案的資料。如要允許 Dialogflow 主控台建立專案，請選取 [Create a new Google project] (建立新 Google 專案)。
按一下 [Create] (建立) 按鈕。

將範例檔案匯入代理程式

本指南中的步驟會假設您的代理程式符合某些條件，因此您需要匯入為本指南準備的代理程式。匯入時，這些步驟會使用「還原」選項，覆寫所有代理程式設定、意圖和實體。

如要匯入檔案，請按照下列步驟操作：

下載 room-booking-agent.zip 檔案。
前往 Dialogflow ES 主控台。
選取代理程式。
按一下代理程式名稱旁邊的設定按鈕。
選取「匯出與匯入」分頁標籤。
選取「從 ZIP 檔案還原」然後按照操作說明還原您下載的 ZIP 檔案。

偵測意圖

如要偵測意圖，請呼叫 Sessions 類型的 detectIntent 方法。

REST

下載 book-a-room.wav 範例輸入音訊檔案，其內容為「book a room」(預訂會議室)。這個範例音訊檔案必須採用 Base64 編碼，才能透過下方的 JSON 要求提供。以下是 Linux 範例：

wget https://cloud.google.com/dialogflow/es/docs/data/book-a-room.wav
base64 -w 0 book-a-room.wav > book-a-room.b64

如需其他平台的範例，請參閱 Cloud Speech-to-Text API 說明文件中的「Base64 編碼音訊內容」一文。

使用任何要求資料之前，請先替換以下項目：

PROJECT_ID：您的 Google Cloud 專案 ID
AUDIO：Base64 編碼音訊內容

HTTP 方法和網址：

POST https://dialogflow.googleapis.com/v2/projects/PROJECT_ID/agent/sessions/123456789:detectIntent

JSON 要求主體：

{
  "queryInput": {
    "audioConfig": {
      "languageCode": "en-US"
    }
  },
  "inputAudio": "AUDIO"
}

如要傳送要求，請展開以下其中一個選項：

curl (Linux、macOS 或 Cloud Shell)

注意： 下列指令假設您已執行 gcloud init 或 gcloud auth login，透過使用者帳戶登入 gcloud CLI，或使用 Cloud Shell，自動登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "x-goog-user-project: PROJECT_ID" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://dialogflow.googleapis.com/v2/projects/PROJECT_ID/agent/sessions/123456789:detectIntent"

PowerShell (Windows)

注意： 下列指令假設您已執行 gcloud init 或 gcloud auth login，透過使用者帳戶登入 gcloud CLI。您可以執行 gcloud auth list 查看目前有效的帳戶。

將要求主體儲存在名為 request.json 的檔案中，然後執行下列指令：

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred"; "x-goog-user-project" = "PROJECT_ID" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://dialogflow.googleapis.com/v2/projects/PROJECT_ID/agent/sessions/123456789:detectIntent" | Select-Object -Expand Content

APIs Explorer (瀏覽器)

複製要求內文並開啟方法參考資料頁面。系統會在頁面右側開啟 APIs Explorer 面板。您可以使用這項工具來傳送要求。將要求內文貼到這項工具中，並填妥其他必填欄位，然後按一下「Execute」(執行)。

您應該會收到如下的 JSON 回應：

{
  "responseId": "3c1e5a89-75b9-4c3f-b63d-4b1351dd5e32",
  "queryResult": {
    "queryText": "book a room",
    "action": "room.reservation",
    "parameters": {
      "time": "",
      "date": "",
      "guests": "",
      "duration": "",
      "location": ""
    },
    "fulfillmentText": "I can help with that. Where would you like to reserve a room?",
    "fulfillmentMessages": [
      {
        "text": {
          "text": [
            "I can help with that. Where would you like to reserve a room?"
          ]
        }
      }
    ],
    "intent": {
      "name": "projects/PROJECT_ID/agent/intents/e8f6a63e-73da-4a1a-8bfc-857183f71228",
      "displayName": "room.reservation"
    },
    "intentDetectionConfidence": 1,
    "diagnosticInfo": {},
    "languageCode": "en-us"
  }
}

請注意，queryResult.action 欄位的值為「room.reservation」，而 queryResult.fulfillmentMessages[0|1].text.text[0] 欄位的值會要求使用者提供更多資訊。

Go

如要向 Dialogflow 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證」。

func DetectIntentAudio(projectID, sessionID, audioFile, languageCode string) (string, error) {
	ctx := context.Background()

	sessionClient, err := dialogflow.NewSessionsClient(ctx)
	if err != nil {
		return "", err
	}
	defer sessionClient.Close()

	if projectID == "" || sessionID == "" {
		return "", fmt.Errorf("detect.DetectIntentAudio empty project (%s) or session (%s)", projectID, sessionID)
	}

	sessionPath := fmt.Sprintf("projects/%s/agent/sessions/%s", projectID, sessionID)

	// In this example, we hard code the encoding and sample rate for simplicity.
	audioConfig := dialogflowpb.InputAudioConfig{AudioEncoding: dialogflowpb.AudioEncoding_AUDIO_ENCODING_LINEAR_16, SampleRateHertz: 16000, LanguageCode: languageCode}

	queryAudioInput := dialogflowpb.QueryInput_AudioConfig{AudioConfig: &audioConfig}

	audioBytes, err := os.ReadFile(audioFile)
	if err != nil {
		return "", err
	}

	queryInput := dialogflowpb.QueryInput{Input: &queryAudioInput}
	request := dialogflowpb.DetectIntentRequest{Session: sessionPath, QueryInput: &queryInput, InputAudio: audioBytes}

	response, err := sessionClient.DetectIntent(ctx, &request)
	if err != nil {
		return "", err
	}

	queryResult := response.GetQueryResult()
	fulfillmentText := queryResult.GetFulfillmentText()
	return fulfillmentText, nil
}

Java

如要向 Dialogflow 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證」。


import com.google.api.gax.rpc.ApiException;
import com.google.cloud.dialogflow.v2.AudioEncoding;
import com.google.cloud.dialogflow.v2.DetectIntentRequest;
import com.google.cloud.dialogflow.v2.DetectIntentResponse;
import com.google.cloud.dialogflow.v2.InputAudioConfig;
import com.google.cloud.dialogflow.v2.QueryInput;
import com.google.cloud.dialogflow.v2.QueryResult;
import com.google.cloud.dialogflow.v2.SessionName;
import com.google.cloud.dialogflow.v2.SessionsClient;
import com.google.protobuf.ByteString;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DetectIntentAudio {

  // DialogFlow API Detect Intent sample with audio files.
  public static QueryResult detectIntentAudio(
      String projectId, String audioFilePath, String sessionId, String languageCode)
      throws IOException, ApiException {
    // Instantiates a client
    try (SessionsClient sessionsClient = SessionsClient.create()) {
      // Set the session name using the sessionId (UUID) and projectID (my-project-id)
      SessionName session = SessionName.of(projectId, sessionId);
      System.out.println("Session Path: " + session.toString());

      // Note: hard coding audioEncoding and sampleRateHertz for simplicity.
      // Audio encoding of the audio content sent in the query request.
      AudioEncoding audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16;
      int sampleRateHertz = 16000;

      // Instructs the speech recognizer how to process the audio content.
      InputAudioConfig inputAudioConfig =
          InputAudioConfig.newBuilder()
              .setAudioEncoding(
                  audioEncoding) // audioEncoding = AudioEncoding.AUDIO_ENCODING_LINEAR_16
              .setLanguageCode(languageCode) // languageCode = "en-US"
              .setSampleRateHertz(sampleRateHertz) // sampleRateHertz = 16000
              .build();

      // Build the query with the InputAudioConfig
      QueryInput queryInput = QueryInput.newBuilder().setAudioConfig(inputAudioConfig).build();

      // Read the bytes from the audio file
      byte[] inputAudio = Files.readAllBytes(Paths.get(audioFilePath));

      // Build the DetectIntentRequest
      DetectIntentRequest request =
          DetectIntentRequest.newBuilder()
              .setSession(session.toString())
              .setQueryInput(queryInput)
              .setInputAudio(ByteString.copyFrom(inputAudio))
              .build();

      // Performs the detect intent request
      DetectIntentResponse response = sessionsClient.detectIntent(request);

      // Display the query result
      QueryResult queryResult = response.getQueryResult();
      System.out.println("====================");
      System.out.format("Query Text: '%s'\n", queryResult.getQueryText());
      System.out.format(
          "Detected Intent: %s (confidence: %f)\n",
          queryResult.getIntent().getDisplayName(), queryResult.getIntentDetectionConfidence());
      System.out.format(
          "Fulfillment Text: '%s'\n",
          queryResult.getFulfillmentMessagesCount() > 0
              ? queryResult.getFulfillmentMessages(0).getText()
              : "Triggered Default Fallback Intent");

      return queryResult;
    }
  }
}

Node.js

如要向 Dialogflow 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證」。

const fs = require('fs');
const util = require('util');
const {struct} = require('pb-util');
// Imports the Dialogflow library
const dialogflow = require('@google-cloud/dialogflow');

// Instantiates a session client
const sessionClient = new dialogflow.SessionsClient();

// The path to identify the agent that owns the created intent.
const sessionPath = sessionClient.projectAgentSessionPath(
  projectId,
  sessionId
);

// Read the content of the audio file and send it as part of the request.
const readFile = util.promisify(fs.readFile);
const inputAudio = await readFile(filename);
const request = {
  session: sessionPath,
  queryInput: {
    audioConfig: {
      audioEncoding: encoding,
      sampleRateHertz: sampleRateHertz,
      languageCode: languageCode,
    },
  },
  inputAudio: inputAudio,
};

// Recognizes the speech in the audio and detects its intent.
const [response] = await sessionClient.detectIntent(request);

console.log('Detected intent:');
const result = response.queryResult;
// Instantiates a context client
const contextClient = new dialogflow.ContextsClient();

console.log(`  Query: ${result.queryText}`);
console.log(`  Response: ${result.fulfillmentText}`);
if (result.intent) {
  console.log(`  Intent: ${result.intent.displayName}`);
} else {
  console.log('  No intent matched.');
}
const parameters = JSON.stringify(struct.decode(result.parameters));
console.log(`  Parameters: ${parameters}`);
if (result.outputContexts && result.outputContexts.length) {
  console.log('  Output contexts:');
  result.outputContexts.forEach(context => {
    const contextId =
      contextClient.matchContextFromProjectAgentSessionContextName(
        context.name
      );
    const contextParameters = JSON.stringify(
      struct.decode(context.parameters)
    );
    console.log(`    ${contextId}`);
    console.log(`      lifespan: ${context.lifespanCount}`);
    console.log(`      parameters: ${contextParameters}`);
  });
}

Python

如要向 Dialogflow 進行驗證，請設定應用程式預設憑證。詳情請參閱「為本機開發環境設定驗證」。

def detect_intent_audio(project_id, session_id, audio_file_path, language_code):
    """Returns the result of detect intent with an audio file as input.

    Using the same `session_id` between requests allows continuation
    of the conversation."""
    from google.cloud import dialogflow

    session_client = dialogflow.SessionsClient()

    # Note: hard coding audio_encoding and sample_rate_hertz for simplicity.
    audio_encoding = dialogflow.AudioEncoding.AUDIO_ENCODING_LINEAR_16
    sample_rate_hertz = 16000

    session = session_client.session_path(project_id, session_id)
    print("Session path: {}\n".format(session))

    with open(audio_file_path, "rb") as audio_file:
        input_audio = audio_file.read()

    audio_config = dialogflow.InputAudioConfig(
        audio_encoding=audio_encoding,
        language_code=language_code,
        sample_rate_hertz=sample_rate_hertz,
    )
    query_input = dialogflow.QueryInput(audio_config=audio_config)

    request = dialogflow.DetectIntentRequest(
        session=session,
        query_input=query_input,
        input_audio=input_audio,
    )
    response = session_client.detect_intent(request=request)

    print("=" * 20)
    print("Query text: {}".format(response.query_result.query_text))
    print(
        "Detected intent: {} (confidence: {})\n".format(
            response.query_result.intent.display_name,
            response.query_result.intent_detection_confidence,
        )
    )
    print("Fulfillment text: {}\n".format(response.query_result.fulfillment_text))

其他語言

C#：請按照用戶端程式庫頁面上的C# 設定說明操作，然後前往 .NET 適用的 Dialogflow 參考說明文件。

PHP：請按照用戶端程式庫頁面上的 PHP 設定說明操作，然後前往 PHP 適用的 Dialogflow 參考文件。

Ruby：請按照用戶端程式庫頁面的 Ruby 設定說明操作，然後前往 Ruby 適用的 Dialogflow 參考說明文件。

使用 FieldMask 更新資料

串流輸入音訊來偵測意圖

利用音訊輸入檔案偵測意圖 透過集合功能整理內容 你可以依據偏好儲存及分類內容。

事前準備

建立虛擬服務專員

將範例檔案匯入代理程式

偵測意圖

REST

curl (Linux、macOS 或 Cloud Shell)

PowerShell (Windows)

APIs Explorer (瀏覽器)

Go

Java

Node.js

Python

其他語言

利用音訊輸入檔案偵測意圖